Adverse drug reaction prediction using scores produced by large-scale drug-protein target docking on high-performance computing machines.
Late-stage or post-market identification of adverse drug reactions (ADRs) is a significant public health issue and a source of major economic liability for drug development. Thus, reliable in silico screening of drug candidates for possible ADRs would be advantageous. In this work, we introduce a co...
Main Authors: | Montiago X LaBute, Xiaohua Zhang, Jason Lenderman, Brian J Bennion, Sergio E Wong, Felice C Lightstone |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2014-01-01 |
Series: | PLoS ONE |
Online Access: | http://europepmc.org/articles/PMC4156361?pdf=render |
id |
doaj-3f5e154ee4fe418c802150f7e9b7a8dd |
---|---|
record_format |
Article |
spelling |
doaj-3f5e154ee4fe418c802150f7e9b7a8dd; 2020-11-24T21:32:00Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2014-01-01; 9; 9; e106298; 10.1371/journal.pone.0106298; Adverse drug reaction prediction using scores produced by large-scale drug-protein target docking on high-performance computing machines.; Montiago X LaBute, Xiaohua Zhang, Jason Lenderman, Brian J Bennion, Sergio E Wong, Felice C Lightstone; http://europepmc.org/articles/PMC4156361?pdf=render |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Montiago X LaBute Xiaohua Zhang Jason Lenderman Brian J Bennion Sergio E Wong Felice C Lightstone |
title |
Adverse drug reaction prediction using scores produced by large-scale drug-protein target docking on high-performance computing machines. |
publisher |
Public Library of Science (PLoS) |
series |
PLoS ONE |
issn |
1932-6203 |
publishDate |
2014-01-01 |
description |
Late-stage or post-market identification of adverse drug reactions (ADRs) is a significant public health issue and a source of major economic liability for drug development. Thus, reliable in silico screening of drug candidates for possible ADRs would be advantageous. In this work, we introduce a computational approach that predicts ADRs by combining the results of molecular docking with known ADR information from DrugBank and SIDER. We employed a recently parallelized version of AutoDock Vina (VinaLC) to dock 906 small molecule drugs to a virtual panel of 409 DrugBank protein targets. L1-regularized logistic regression models were trained on the resulting docking scores of a 560-compound subset of the initial 906 compounds to predict 85 side effects, grouped into 10 ADR phenotype groups. Only 21% (87 out of 409) of the drug-protein binding features involve known targets of the drug subset, providing a significant probe of off-target effects. As a control, associations of this drug subset with the 555 annotated targets of these compounds, as reported in DrugBank, were used as features to train a separate group of models. The Vina off-target models and the DrugBank on-target models yielded comparable median areas under the receiver operating characteristic curve (AUCs) during 10-fold cross-validation (0.60-0.69 and 0.61-0.74, respectively). Evidence was found in the PubMed literature to support several putative ADR-protein associations identified by our analysis. Among them, several associations between neoplasm-related ADRs and known tumor suppressor and tumor invasiveness marker proteins were found. A dual role for interstitial collagenase in both neoplasms and aneurysm formation was also identified. These associations all involve off-target proteins and could not have been found using available drug/on-target interaction data. This study illustrates a path forward to comprehensive ADR virtual screening that can potentially scale with an increasing number of CPUs to tens of thousands of protein targets and millions of potential drug candidates. |
url |
http://europepmc.org/articles/PMC4156361?pdf=render |
_version_ |
1725959010505457664 |
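
The abstract describes training L1-regularized logistic regression models on docking scores and reporting median AUCs over 10-fold cross-validation. The sketch below illustrates that general workflow only; it is not the authors' code. Everything in it is an assumption made for illustration: the data are synthetic (standardized random numbers standing in for docking scores), the solver is plain proximal gradient descent, the folds are stratified by label, and all function names and hyperparameters (`fit_l1_logreg`, `lam`, `lr`, `epochs`, and so on) are hypothetical.

```python
import math
import random
from statistics import median


def soft_threshold(w, t):
    """Proximal operator of the L1 penalty: shrink w toward zero by t."""
    if w > t:
        return w - t
    if w < -t:
        return w + t
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_l1_logreg(X, y, lam=0.01, lr=0.5, epochs=150):
    """Fit L1-penalized logistic regression by proximal gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n  # the intercept is not penalized
        # Gradient step on the mean log-loss, then soft-threshold each weight.
        w = [soft_threshold(wj - lr * gj / n, lr * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def auc(scores, labels):
    """ROC AUC: share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(y, k, seed=0):
    """Split indices into k folds, spreading each class evenly (an assumption here)."""
    rng = random.Random(seed)
    pos = [i for i, l in enumerate(y) if l == 1]
    neg = [i for i, l in enumerate(y) if l == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)
    return [pos[f::k] + neg[f::k] for f in range(k)]


def cross_validated_aucs(X, y, k=10, seed=0, **fit_kwargs):
    """Return one held-out AUC per fold."""
    aucs = []
    for fold in stratified_folds(y, k, seed):
        held_out = set(fold)
        train = [i for i in range(len(X)) if i not in held_out]
        w, b = fit_l1_logreg([X[i] for i in train], [y[i] for i in train],
                             **fit_kwargs)
        scores = [sum(wj * xj for wj, xj in zip(w, X[i])) + b for i in fold]
        aucs.append(auc(scores, [y[i] for i in fold]))
    return aucs


def make_synthetic(n=120, d=8, seed=1):
    """Synthetic stand-in: rows are drugs, columns are standardized docking scores."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    # Hypothetical ground truth: only proteins 0 and 3 drive the side effect.
    y = [1 if -1.5 * x[0] + 1.0 * x[3] + rng.gauss(0.0, 0.5) > 0 else 0
         for x in X]
    return X, y


if __name__ == "__main__":
    X, y = make_synthetic()
    fold_aucs = cross_validated_aucs(X, y, k=10)
    print("median AUC over 10 folds:", round(median(fold_aucs), 3))
```

In a real analysis, one such model would be fit per ADR group, with drugs as rows and the docked protein targets as feature columns; the L1 penalty drives many protein weights to exactly zero, which is what makes the surviving ADR-protein associations easy to read off.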