Considering transposable element diversification in de novo annotation approaches.

Transposable elements (TEs) are mobile, repetitive DNA sequences that are almost ubiquitous in prokaryotic and eukaryotic genomes. They have a large impact on genome structure, function and evolution. With the recent development of high-throughput sequencing methods, many genome sequences have becom...

Bibliographic Details
Main Authors: Timothée Flutre, Elodie Duprat, Catherine Feuillet, Hadi Quesneville
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2011-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3031573?pdf=render
id doaj-af749e2945d14183bc74d5db84101cd5
record_format Article
spelling doaj-af749e2945d14183bc74d5db84101cd5; 2020-11-25T00:52:36Z; eng; Public Library of Science (PLoS); PLoS ONE; ISSN 1932-6203; 2011-01-01; vol. 6, no. 1, e16526; doi:10.1371/journal.pone.0016526
collection DOAJ
language English
format Article
sources DOAJ
author Timothée Flutre
Elodie Duprat
Catherine Feuillet
Hadi Quesneville
title Considering transposable element diversification in de novo annotation approaches.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2011-01-01
description Transposable elements (TEs) are mobile, repetitive DNA sequences that are almost ubiquitous in prokaryotic and eukaryotic genomes. They have a large impact on genome structure, function and evolution. With the recent development of high-throughput sequencing methods, many genome sequences have become available, making possible comparative studies of TE dynamics at an unprecedented scale. Several methods have been proposed for the de novo identification of TEs in sequenced genomes. Most begin with the detection of genomic repeats, but the subsequent steps for defining TE families differ. High-quality TE annotations are available for the Drosophila melanogaster and Arabidopsis thaliana genome sequences, providing a solid basis for the benchmarking of such methods. We compared the performance of specific algorithms for the clustering of interspersed repeats and found that only a particular combination of algorithms detected TE families with good recovery of the reference sequences. We then applied a new procedure for reconciling the different clustering results and classifying TE sequences. The whole approach was implemented in a pipeline using the REPET package. Finally, we show that our combined approach highlights the dynamics of well defined TE families by making it possible to identify structural variations among their copies. This approach makes it possible to annotate TE families and to study their diversification in a single analysis, improving our understanding of TE dynamics at the whole-genome scale and for diverse species.
url http://europepmc.org/articles/PMC3031573?pdf=render