ORFcor: identifying and accommodating ORF prediction inconsistencies for phylogenetic analysis.

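The high-throughput annotation of open reading frames (ORFs) required by modern genome sequencing projects necessitates computational protocols that sometimes annotate orthologous ORFs inconsistently. Such inconsistencies hinder comparative analyses by non-uniformly extending or truncating 5' and/or 3' sequence ends, causing ORFs that are in fact identical to artificially diverge. Whereas strategies exist to correct such inconsistencies during whole-genome annotation, equivalent software designed to correct subsets of these data without genome reannotation is lacking. We therefore developed ORFcor, which corrects annotation inconsistencies using consensus start and stop positions derived from sets of closely related orthologs. ORFcor corrects inconsistent ORF annotations in diverse test datasets with specificities and sensitivities approaching 100% when sufficiently related orthologs (e.g., from the same taxonomic family) are available for comparison. The ORFcor package is implemented in Perl, multithreaded to handle large datasets, includes related scripts to facilitate high-throughput phylogenomic analyses, and is freely available at www.currielab.wisc.edu/downloads.html.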

Bibliographic Details
Main Authors: Jonathan L Klassen, Cameron R Currie
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2013-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3590147?pdf=render
id doaj-32396af81c474c8ebfde41c02038f041
record_format Article
spelling doaj-32396af81c474c8ebfde41c02038f041; 2020-11-25T01:51:09Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2013-01-01; 8; 3; e58387; 10.1371/journal.pone.0058387; ORFcor: identifying and accommodating ORF prediction inconsistencies for phylogenetic analysis.; Jonathan L Klassen; Cameron R Currie; http://europepmc.org/articles/PMC3590147?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
author Jonathan L Klassen
Cameron R Currie
author_facet Jonathan L Klassen
Cameron R Currie
author_sort Jonathan L Klassen
title ORFcor: identifying and accommodating ORF prediction inconsistencies for phylogenetic analysis.
title_sort orfcor: identifying and accommodating orf prediction inconsistencies for phylogenetic analysis.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2013-01-01
description The high-throughput annotation of open reading frames (ORFs) required by modern genome sequencing projects necessitates computational protocols that sometimes annotate orthologous ORFs inconsistently. Such inconsistencies hinder comparative analyses by non-uniformly extending or truncating 5' and/or 3' sequence ends, causing ORFs that are in fact identical to artificially diverge. Whereas strategies exist to correct such inconsistencies during whole-genome annotation, equivalent software designed to correct subsets of these data without genome reannotation is lacking. We therefore developed ORFcor, which corrects annotation inconsistencies using consensus start and stop positions derived from sets of closely related orthologs. ORFcor corrects inconsistent ORF annotations in diverse test datasets with specificities and sensitivities approaching 100% when sufficiently related orthologs (e.g., from the same taxonomic family) are available for comparison. The ORFcor package is implemented in Perl, multithreaded to handle large datasets, includes related scripts to facilitate high-throughput phylogenomic analyses, and is freely available at www.currielab.wisc.edu/downloads.html.
url http://europepmc.org/articles/PMC3590147?pdf=render
work_keys_str_mv AT jonathanlklassen orfcoridentifyingandaccommodatingorfpredictioninconsistenciesforphylogeneticanalysis
AT cameronrcurrie orfcoridentifyingandaccommodatingorfpredictioninconsistenciesforphylogeneticanalysis
_version_ 1724998254663303168