A ChIP-Seq benchmark shows that sequence conservation mainly improves detection of strong transcription factor binding sites.
BACKGROUND: Transcription factors are important controllers of gene expression and mapping transcription factor binding sites (TFBS) is key to inferring transcription factor regulatory networks. Several methods for predicting TFBS exist, but there are no standard genome-wide datasets on which to ass...
Main Authors: | Tony Håndstad, Morten Beck Rye, Finn Drabløs, Pål Sætrom |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2011-01-01 |
Series: | PLoS ONE |
Online Access: | http://europepmc.org/articles/PMC3077367?pdf=render |
id |
doaj-2214246fc4534d568129be362c1d4fe5 |
---|---|
record_format |
Article |
spelling |
doaj-2214246fc4534d568129be362c1d4fe5; 2020-11-25T02:39:29Z; eng; Public Library of Science (PLoS); PLoS ONE; ISSN 1932-6203; 2011-01-01; vol. 6, no. 4, e18430; doi:10.1371/journal.pone.0018430; A ChIP-Seq benchmark shows that sequence conservation mainly improves detection of strong transcription factor binding sites.; Tony Håndstad, Morten Beck Rye, Finn Drabløs, Pål Sætrom; http://europepmc.org/articles/PMC3077367?pdf=render |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Tony Håndstad, Morten Beck Rye, Finn Drabløs, Pål Sætrom |
author_sort |
Tony Håndstad |
title |
A ChIP-Seq benchmark shows that sequence conservation mainly improves detection of strong transcription factor binding sites. |
publisher |
Public Library of Science (PLoS) |
series |
PLoS ONE |
issn |
1932-6203 |
publishDate |
2011-01-01 |
description |
BACKGROUND: Transcription factors are important controllers of gene expression and mapping transcription factor binding sites (TFBS) is key to inferring transcription factor regulatory networks. Several methods for predicting TFBS exist, but there are no standard genome-wide datasets on which to assess the performance of these prediction methods. Also, it is believed that information about sequence conservation across different genomes can generally improve accuracy of motif-based predictors, but it is not clear under what circumstances use of conservation is most beneficial. RESULTS: Here we use published ChIP-seq data and an improved peak detection method to create comprehensive benchmark datasets for prediction methods which use known descriptors or binding motifs to detect TFBS in genomic sequences. We use this benchmark to assess the performance of five different prediction methods and find that the methods that use information about sequence conservation generally perform better than simpler motif-scanning methods. The difference is greater on high-affinity peaks and when using short and information-poor motifs. However, if the motifs are specific and information-rich, we find that simple motif-scanning methods can perform better than conservation-based methods. CONCLUSIONS: Our benchmark provides a comprehensive test that can be used to rank the relative performance of transcription factor binding site prediction methods. Moreover, our results show that, contrary to previous reports, sequence conservation is better suited for predicting strong than weak transcription factor binding sites. |
url |
http://europepmc.org/articles/PMC3077367?pdf=render |