DeepGRN: prediction of transcription factor binding site across cell-types using attention-based deep neural networks

Bibliographic Details
Main Authors: Birchler, J.A. (Author), Chen, C. (Author), Cheng, J. (Author), Hou, J. (Author), Shi, X. (Author), Yang, H. (Author)
Format: Article
Language: English
Published: BioMed Central Ltd 2021
Subjects:
DNA
Online Access: View Fulltext in Publisher (https://doi.org/10.1186/s12859-020-03952-1)
LEADER 03758nam a2200601Ia 4500
001 10.1186-s12859-020-03952-1
008 220427s2021 CNT 000 0 und d
020 |a 14712105 (ISSN) 
245 1 0 |a DeepGRN: prediction of transcription factor binding site across cell-types using attention-based deep neural networks 
260 0 |b BioMed Central Ltd  |c 2021 
856 |z View Fulltext in Publisher  |u https://doi.org/10.1186/s12859-020-03952-1 
520 3 |a Background: Due to the complexity of biological systems, the prediction of potential DNA binding sites for transcription factors remains a difficult problem in computational biology. Genomic DNA sequences and experimental results from parallel sequencing provide information about the affinity and accessibility of the genome and are commonly used features in binding site prediction. The attention mechanism in deep learning has shown its capability to learn long-range dependencies from sequential data, such as sentences and voices. Until now, no study has applied this approach to binding site inference from massively parallel sequencing data. The successful applications of the attention mechanism in similar input contexts motivate us to build and test new methods that can accurately determine the binding sites of transcription factors. Results: In this study, we propose a novel tool (named DeepGRN) for transcription factor binding site prediction based on the combination of two components: a single attention module and a pairwise attention module. The performance of our methods is evaluated on the ENCODE-DREAM in vivo Transcription Factor Binding Site Prediction Challenge datasets. The results show that DeepGRN achieves higher unified scores in 6 of 13 targets than any of the top four methods in the DREAM challenge. We also demonstrate that the attention weights learned by the model are correlated with potential informative inputs, such as DNase-Seq coverage and motifs, which provide possible explanations for the predictive improvements in DeepGRN. Conclusions: DeepGRN can automatically and effectively predict transcription factor binding sites from DNA sequences and DNase-Seq coverage. Furthermore, the visualization techniques we developed for the attention modules help to interpret how critical patterns from different types of input features are recognized by our model. © 2021, The Author(s). 
650 0 4 |a Attention mechanism 
650 0 4 |a Attention mechanisms 
650 0 4 |a binding site 
650 0 4 |a Binding site predictions 
650 0 4 |a Binding sites 
650 0 4 |a Binding Sites 
650 0 4 |a biology 
650 0 4 |a chromatin 
650 0 4 |a Chromatin 
650 0 4 |a Computational biology 
650 0 4 |a Computational Biology 
650 0 4 |a Deep learning 
650 0 4 |a Deep neural networks 
650 0 4 |a DNA 
650 0 4 |a DNA binding site prediction 
650 0 4 |a DNA sequences 
650 0 4 |a Forecasting 
650 0 4 |a Gene encoding 
650 0 4 |a genetics 
650 0 4 |a Long-range dependencies 
650 0 4 |a Massively parallel sequencing 
650 0 4 |a metabolism 
650 0 4 |a Neural networks 
650 0 4 |a Neural Networks, Computer 
650 0 4 |a Parallel sequencing 
650 0 4 |a protein binding 
650 0 4 |a Protein Binding 
650 0 4 |a Transcription 
650 0 4 |a transcription factor 
650 0 4 |a Transcription factor 
650 0 4 |a Transcription factor binding sites 
650 0 4 |a Transcription factors 
650 0 4 |a Transcription Factors 
650 0 4 |a Visualization technique 
700 1 |a Birchler, J.A.  |e author 
700 1 |a Chen, C.  |e author 
700 1 |a Cheng, J.  |e author 
700 1 |a Hou, J.  |e author 
700 1 |a Shi, X.  |e author 
700 1 |a Yang, H.  |e author 
773 |t BMC Bioinformatics