SAMbinder: A Web Server for Predicting S-Adenosyl-L-Methionine Binding Residues of a Protein From Its Amino Acid Sequence

Motivation: S-adenosyl-L-methionine (SAM) is an essential cofactor in biological systems and plays a key role in many diseases. There is a need for a method that predicts SAM binding sites in a protein in order to design drugs against SAM-associated diseases. To the best of our knowledge,...

Bibliographic Details
Main Authors: Piyush Agrawal, Gaurav Mishra, Gajendra P. S. Raghava
Format: Article
Language: English
Published: Frontiers Media S.A. 2020-01-01
Series: Frontiers in Pharmacology
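Subjects: S-adenosine-L-methionine; PSSM profile; in silico prediction; cancer; machine learning technique (MLT)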
Online Access: https://www.frontiersin.org/article/10.3389/fphar.2019.01690/full
id doaj-d9ff6eb8c0914658b38cb6f297a84dc5
record_format Article
spelling doaj-d9ff6eb8c0914658b38cb6f297a84dc5; 2020-11-25T02:53:48Z; eng; Frontiers Media S.A.; Frontiers in Pharmacology; 1663-9812; 2020-01-01; 10; 10.3389/fphar.2019.01690; 497036
Piyush Agrawal [0, 1]; Gaurav Mishra [2, 3]; Gajendra P. S. Raghava [4]
[0] Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
[1] Bioinformatics Center, CSIR-Institute of Microbial Technology, Chandigarh, India
[2] Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
[3] Department of Electrical Engineering, Shiv Nadar University, Greater Noida, India
[4] Department of Computational Biology, Indraprastha Institute of Information Technology, New Delhi, India
collection DOAJ
language English
format Article
sources DOAJ
author Piyush Agrawal
Gaurav Mishra
Gajendra P. S. Raghava
title SAMbinder: A Web Server for Predicting S-Adenosyl-L-Methionine Binding Residues of a Protein From Its Amino Acid Sequence
publisher Frontiers Media S.A.
series Frontiers in Pharmacology
issn 1663-9812
publishDate 2020-01-01
description Motivation: S-adenosyl-L-methionine (SAM) is an essential cofactor in biological systems and plays a key role in many diseases. There is a need for a method that predicts SAM binding sites in a protein in order to design drugs against SAM-associated diseases. To the best of our knowledge, there is no method that can predict the binding site of SAM in a given protein sequence.
Result: This manuscript describes SAMbinder, a method for predicting SAM-interacting residues in a protein from its primary sequence. All models were trained, tested, and evaluated on 145 SAM-binding protein chains, no two of which share more than 40% sequence similarity. First, models were developed using different machine learning techniques on a balanced data set containing 2,188 SAM-interacting residues and an equal number of non-interacting residues. Our random forest model built on binary profile features achieved a maximum Matthews correlation coefficient (MCC) of 0.42 with an area under the receiver operating characteristic curve (AUROC) of 0.79 on the validation data set. The performance of our models improved significantly, from an MCC of 0.42 to 0.61, when evolutionary information in the form of the position-specific scoring matrix (PSSM) profile was used as a feature. We also developed models on a realistic data set containing 2,188 SAM-interacting and 40,029 non-interacting residues and obtained a maximum MCC of 0.61 with an AUROC of 0.89. To evaluate the performance of our models, we used both internal and external cross-validation techniques.
Availability and Implementation: https://webs.iiitd.edu.in/raghava/sambinder/
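The two technical terms the description relies on, the binary profile feature and the MCC, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the window size of 5, and the choice to encode positions beyond the sequence ends as all-zero vectors are assumptions made only for illustration. The MCC formula itself is the standard one computed from confusion-matrix counts.

```python
import math

# The 20 standard amino acids, in a fixed order used for one-hot encoding.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def binary_profile(sequence: str, center: int, window: int = 5) -> list[int]:
    """One-hot encode a window of residues centred on `center`.

    Assumption: window positions that fall outside the sequence, and any
    non-standard residue, are encoded as 20 zeros.
    """
    half = window // 2
    vector = []
    for pos in range(center - half, center + half + 1):
        residue = sequence[pos] if 0 <= pos < len(sequence) else "X"
        vector.extend(1 if residue == aa else 0 for aa in AMINO_ACIDS)
    return vector


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention: MCC is reported as 0 when a row or column is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom
```

Because the MCC uses all four confusion-matrix counts, it stays informative on data with far more non-interacting than interacting residues, which is presumably why the description reports it for both the balanced and the realistic data set.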
topic S-adenosine-L-methionine
PSSM profile
in silico prediction
cancer
machine learning technique (MLT)
url https://www.frontiersin.org/article/10.3389/fphar.2019.01690/full