Transposon identification using profile HMMs

Bibliographic Details
Main Authors: Liu Jun S, Edlefsen Paul T
Format: Article
Language: English
Published: BMC 2010-02-01
Series: BMC Genomics
id doaj-a42f509854a84d8fb52622e0f6103c61
record_format Article
spelling doaj-a42f509854a84d8fb52622e0f6103c61 2020-11-25T02:10:49Z eng BMC BMC Genomics 1471-2164 2010-02-01 11 Suppl 1 S10 10.1186/1471-2164-11-S1-S10 Transposon identification using profile HMMs Liu Jun S Edlefsen Paul T
collection DOAJ
language English
format Article
sources DOAJ
author Liu Jun S
Edlefsen Paul T
spellingShingle Liu Jun S
Edlefsen Paul T
Transposon identification using profile HMMs
BMC Genomics
author_facet Liu Jun S
Edlefsen Paul T
author_sort Liu Jun S
title Transposon identification using profile HMMs
title_short Transposon identification using profile HMMs
title_full Transposon identification using profile HMMs
title_fullStr Transposon identification using profile HMMs
title_full_unstemmed Transposon identification using profile HMMs
title_sort transposon identification using profile hmms
publisher BMC
series BMC Genomics
issn 1471-2164
publishDate 2010-02-01
description <p>Abstract</p> <p>Background</p> <p>Transposons are "jumping genes" that account for large quantities of repetitive content in genomes. They are known to affect transcriptional regulation in several different ways, and are implicated in many human diseases. Transposons are related to microRNAs and viruses, and many genes, pseudogenes, and gene promoters are derived from transposons or have origins in transposon-induced duplication. Modeling transposon-derived genomic content is difficult because transposons are poorly conserved. Profile hidden Markov models (profile HMMs), widely used for protein sequence family modeling, are rarely used for modeling DNA sequence families. The algorithm commonly used to estimate the parameters of profile HMMs, Baum-Welch, is prone to premature convergence to local optima. The DNA domain is especially problematic for the Baum-Welch algorithm, since its alphabet has only four letters, as opposed to the twenty residues of the amino acid alphabet.</p> <p>Results</p> <p>We demonstrate with a simulation study and with an application to modeling the MIR family of transposons that two recently introduced methods, Conditional Baum-Welch and Dynamic Model Surgery, achieve better estimates of the parameters of profile HMMs across a range of conditions.</p> <p>Conclusions</p> <p>We argue that these new algorithms expand the range of potential applications of profile HMMs to many important DNA sequence family modeling problems, including that of searching for and modeling the virus-like transposons that are found in all known genomes.</p>
work_keys_str_mv AT liujuns transposonidentificationusingprofilehmms
AT edlefsenpault transposonidentificationusingprofilehmms
_version_ 1724917253569249280