Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space.

The generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here...

Full description

Bibliographic Details
Main Authors: Rahul Karnik, Michael A Beer
Format: Article
Language:English
Published: Public Library of Science (PLoS) 2015-01-01
Series:PLoS ONE
Online Access:http://europepmc.org/articles/PMC4605740?pdf=render
id doaj-098783d1129743058084d25587ede89b
record_format Article
spelling doaj-098783d1129743058084d25587ede89b2020-11-25T01:51:13ZengPublic Library of Science (PLoS)PLoS ONE1932-62032015-01-011010e014055710.1371/journal.pone.0140557Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space.Rahul KarnikMichael A BeerThe generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here we present a novel computational algorithm called MotifSpec, designed to find predictive motifs, in contrast to over-represented sequence elements. The key distinguishing feature of this algorithm is that it uses a dynamic search space and a learned threshold to find discriminative motifs in combination with the modeling of motifs using a full PWM (position weight matrix) rather than k-mer words or regular expressions. We demonstrate that our approach finds motifs corresponding to known binding specificities in several mammalian ChIP-seq datasets, and that our PWMs classify the ChIP-seq signals with accuracy comparable to, or marginally better than motifs from the best existing algorithms. In other datasets, our algorithm identifies novel motifs where other methods fail. Finally, we apply this algorithm to detect motifs from expression datasets in C. elegans using a dynamic expression similarity metric rather than fixed expression clusters, and find novel predictive motifs.http://europepmc.org/articles/PMC4605740?pdf=render
collection DOAJ
language English
format Article
sources DOAJ
author Rahul Karnik
Michael A Beer
spellingShingle Rahul Karnik
Michael A Beer
Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space.
PLoS ONE
author_facet Rahul Karnik
Michael A Beer
author_sort Rahul Karnik
title Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space.
title_short Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space.
title_full Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space.
title_fullStr Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space.
title_full_unstemmed Identification of Predictive Cis-Regulatory Elements Using a Discriminative Objective Function and a Dynamic Search Space.
title_sort identification of predictive cis-regulatory elements using a discriminative objective function and a dynamic search space.
publisher Public Library of Science (PLoS)
series PLoS ONE
issn 1932-6203
publishDate 2015-01-01
description The generation of genomic binding or accessibility data from massively parallel sequencing technologies such as ChIP-seq and DNase-seq continues to accelerate. Yet state-of-the-art computational approaches for the identification of DNA binding motifs often yield motifs of weak predictive power. Here we present a novel computational algorithm called MotifSpec, designed to find predictive motifs, in contrast to over-represented sequence elements. The key distinguishing feature of this algorithm is that it uses a dynamic search space and a learned threshold to find discriminative motifs in combination with the modeling of motifs using a full PWM (position weight matrix) rather than k-mer words or regular expressions. We demonstrate that our approach finds motifs corresponding to known binding specificities in several mammalian ChIP-seq datasets, and that our PWMs classify the ChIP-seq signals with accuracy comparable to, or marginally better than motifs from the best existing algorithms. In other datasets, our algorithm identifies novel motifs where other methods fail. Finally, we apply this algorithm to detect motifs from expression datasets in C. elegans using a dynamic expression similarity metric rather than fixed expression clusters, and find novel predictive motifs.
url http://europepmc.org/articles/PMC4605740?pdf=render
work_keys_str_mv AT rahulkarnik identificationofpredictivecisregulatoryelementsusingadiscriminativeobjectivefunctionandadynamicsearchspace
AT michaelabeer identificationofpredictivecisregulatoryelementsusingadiscriminativeobjectivefunctionandadynamicsearchspace
_version_ 1724997778833145856