Machine learning predicts nucleosome binding modes of transcription factors

Background: Most transcription factors (TFs) compete with nucleosomes to gain access to their cognate binding sites. Recent studies have identified several TF-nucleosome interaction modes including end binding (EB), oriented binding, periodic binding, dyad binding, groove binding, and gyre spanning....


Bibliographic Details
Main Authors: Cui, F. (Author), Kishan, K.C. (Author), Li, R. (Author), Subramanya, S.K. (Author)
Format: Article
Language: English
Published: BioMed Central Ltd 2021
Online Access: View Fulltext in Publisher (https://doi.org/10.1186/s12859-021-04093-9)
LEADER 02825nam a2200529Ia 4500
001 10.1186-s12859-021-04093-9
008 220427s2021 CNT 000 0 und d
020 |a 14712105 (ISSN) 
245 1 0 |a Machine learning predicts nucleosome binding modes of transcription factors 
260 0 |b BioMed Central Ltd  |c 2021 
856 |z View Fulltext in Publisher  |u https://doi.org/10.1186/s12859-021-04093-9 
520 3 |a Background: Most transcription factors (TFs) compete with nucleosomes to gain access to their cognate binding sites. Recent studies have identified several TF-nucleosome interaction modes including end binding (EB), oriented binding, periodic binding, dyad binding, groove binding, and gyre spanning. However, there are substantial experimental challenges in measuring nucleosome binding modes for thousands of TFs in different species. Results: We present a computational method that predicts the binding modes from TF protein sequences. Under a nested cross-validation procedure, our model outperforms several fine-tuned off-the-shelf machine learning (ML) methods in the multi-label classification task. Our binary classifier for the EB mode also outperforms these ML methods, reaching an area under the precision-recall curve of 75%. The end preference of most TFs is consistent with low nucleosome occupancy around their binding sites in GM12878 cells. The nucleosome occupancy data are used as an alternative dataset to confirm the superiority of our EB classifier. Conclusions: We develop the first ML-based approach for efficient and comprehensive analysis of nucleosome binding modes of TFs. © 2021, The Author(s). 
650 0 4 |a amino acid sequence 
650 0 4 |a Amino Acid Sequence 
650 0 4 |a Binary classifiers 
650 0 4 |a Binding energy 
650 0 4 |a binding site 
650 0 4 |a Binding sites 
650 0 4 |a Binding Sites 
650 0 4 |a Classification (of information) 
650 0 4 |a Comprehensive analysis 
650 0 4 |a Computational predictions 
650 0 4 |a genetics 
650 0 4 |a Interaction modes 
650 0 4 |a machine learning 
650 0 4 |a Machine learning 
650 0 4 |a Machine Learning 
650 0 4 |a metabolism 
650 0 4 |a Multi label classification 
650 0 4 |a Nested cross validations 
650 0 4 |a nucleosome 
650 0 4 |a Nucleosome binding modes 
650 0 4 |a Nucleosomes 
650 0 4 |a Off-the-shelf machine 
650 0 4 |a protein binding 
650 0 4 |a Protein Binding 
650 0 4 |a Protein sequences 
650 0 4 |a transcription factor 
650 0 4 |a Transcription factors 
650 0 4 |a Transcription Factors 
700 1 |a Cui, F.  |e author 
700 1 |a Kishan, K.C.  |e author 
700 1 |a Li, R.  |e author 
700 1 |a Subramanya, S.K.  |e author 
773 |t BMC Bioinformatics