Maximising the size of non-redundant protein datasets using graph theory.

Analysis of protein data sets often requires prior removal of redundancy, so that the data are not biased by containing similar proteins. This is usually achieved by pairwise comparison of sequences, followed by purging so that no pair of proteins has a similarity above a chosen threshold. From a starting set,...
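
The compare-then-purge step described in the abstract can be pictured as a graph problem: each protein is a node, and an edge joins two proteins whose similarity exceeds the threshold. The following is a minimal sketch of one simple greedy heuristic for this, not necessarily the method used by the authors. The function name `purge_redundant`, the `similarity` callback, and the toy scores are all assumptions made for illustration.

```python
from itertools import combinations


def purge_redundant(names, similarity, threshold):
    """Return a subset of names in which no pair scores above threshold.

    Greedy heuristic (an illustrative assumption): repeatedly discard the
    protein that currently has the most too-similar neighbours.
    """
    # Build the redundancy graph: an edge means "too similar to keep both".
    neighbours = {name: set() for name in names}
    for a, b in combinations(names, 2):
        if similarity(a, b) > threshold:
            neighbours[a].add(b)
            neighbours[b].add(a)

    while True:
        # Pick the protein with the largest number of remaining neighbours.
        worst = max(neighbours, key=lambda n: len(neighbours[n]), default=None)
        if worst is None or not neighbours[worst]:
            break  # no edges left, so the remaining set is non-redundant
        for other in neighbours.pop(worst):
            neighbours[other].discard(worst)

    return sorted(neighbours)


# Toy data (hypothetical scores): A~B and B~C are above the threshold.
toy_scores = {frozenset(("A", "B")): 0.9, frozenset(("B", "C")): 0.8}


def toy_similarity(a, b):
    return toy_scores.get(frozenset((a, b)), 0.1)


kept = purge_redundant(["A", "B", "C", "D"], toy_similarity, threshold=0.5)
print(kept)  # ['A', 'C', 'D']
```

In the toy case, discarding B alone keeps three proteins, whereas discarding A and C would keep only two. A greedy rule like this one is not guaranteed to retain the largest possible set in general.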

Bibliographic Details
Main Authors: Simon C Bull, Mark R Muldoon, Andrew J Doig
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2013-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3564766?pdf=render