Efficient Gene Tree Correction Guided by Genome Evolution.

Gene trees inferred solely from multiple alignments of homologous sequences often contain weakly supported and uncertain branches. Information for their full resolution may lie in the dependency between gene families and their genomic context. Integrative methods, using species tree information in a...


Bibliographic Details
Main Authors: Emmanuel Noutahi, Magali Semeria, Manuel Lafond, Jonathan Seguin, Bastien Boussau, Laurent Guéguen, Nadia El-Mabrouk, Eric Tannier
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2016-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC4981423?pdf=render
id doaj-9641413795f74c629087e492f96bae47
record_format Article
spelling doaj-9641413795f74c629087e492f96bae47 | 2020-11-24T20:45:28Z | eng | Public Library of Science (PLoS) | PLoS ONE | ISSN 1932-6203 | 2016-01-01 | 11(8): e0159559 | doi:10.1371/journal.pone.0159559
collection DOAJ
language English
format Article
sources DOAJ
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2016-01-01
description Gene trees inferred solely from multiple alignments of homologous sequences often contain weakly supported and uncertain branches. Information for their full resolution may lie in the dependency between gene families and their genomic context. Integrative methods, using species tree information in addition to sequence information, often rely on a computationally intensive tree space search, which precludes their application to large genomic databases. We propose a new method, called ProfileNJ, that takes a gene tree with statistical supports on its branches and corrects its weakly supported parts by using a combination of information from a species tree and a distance matrix. Its low running time enabled us to use it on the whole Ensembl Compara database, for which we propose an alternative, arguably more plausible set of gene trees. This allowed us to perform a genome-wide analysis of duplication and loss patterns on the history of 63 eukaryote species, and to predict ancestral gene content and order for all ancestors along the phylogeny. A web interface called RefineTree, including ProfileNJ as well as other gene tree correction methods, which we also test on the Ensembl gene families, is available at: http://www-ens.iro.umontreal.ca/~adbit/polytomysolver.html. The code of ProfileNJ as well as the set of gene trees corrected by ProfileNJ from Ensembl Compara version 73 families are also made available.