Whole Proteome Clustering of 2,307 Proteobacterial Genomes Reveals Conserved Proteins and Significant Annotation Issues

We clustered 8.76 M protein sequences deduced from 2,307 completely sequenced Proteobacterial genomes, resulting in 707,311 clusters of one or more sequences, of which 224,442 ranged in size from 2 to 2,894 sequences. To our knowledge, this is the first study of this scale. We were surprised to find th...

Bibliographic Details
Main Authors: Svetlana Lockwood, Kelly A. Brayton, Jeff A. Daily, Shira L. Broschat
Format: Article
Language: English
Published: Frontiers Media S.A. 2019-02-01
Series: Frontiers in Microbiology
Online Access: https://www.frontiersin.org/article/10.3389/fmicb.2019.00383/full