Multi-PLI: interpretable multi‐task deep learning model for unifying protein–ligand interaction datasets

Abstract The assessment of protein–ligand interactions is critical at an early stage of drug discovery. Computational approaches for efficiently predicting such interactions facilitate drug development. Recently, methods based on deep learning, including structure- and sequence-based models, have achie...


Bibliographic Details
Main Authors: Fan Hu, Jiaxin Jiang, Dongqi Wang, Muchun Zhu, Peng Yin
Format: Article
Language: English
Published: BMC 2021-04-01
Series: Journal of Cheminformatics
Subjects: Interpretable, Deep learning, Multi‐task, Drug discovery
Online Access: https://doi.org/10.1186/s13321-021-00510-6
id doaj-28244e0eeb884753ad2cf40e77c9ee87
record_format Article
spelling doaj-28244e0eeb884753ad2cf40e77c9ee87 2021-04-18T11:44:31Z eng BMC Journal of Cheminformatics 1758-2946 2021-04-01 volume 13, issue 1, pages 1-14 10.1186/s13321-021-00510-6
Multi-PLI: interpretable multi‐task deep learning model for unifying protein–ligand interaction datasets
Fan Hu, Jiaxin Jiang, Dongqi Wang, Muchun Zhu, Peng Yin
Affiliation (all five authors): Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences
https://doi.org/10.1186/s13321-021-00510-6
Interpretable, Deep learning, Multi‐task, Drug discovery
collection DOAJ
language English
format Article
sources DOAJ
author Fan Hu
Jiaxin Jiang
Dongqi Wang
Muchun Zhu
Peng Yin
title Multi-PLI: interpretable multi‐task deep learning model for unifying protein–ligand interaction datasets
publisher BMC
series Journal of Cheminformatics
issn 1758-2946
publishDate 2021-04-01
description The assessment of protein–ligand interactions is critical at an early stage of drug discovery. Computational approaches for efficiently predicting such interactions facilitate drug development. Recently, methods based on deep learning, including structure- and sequence-based models, have achieved impressive performance on several different datasets. However, their application still suffers from a generalizability issue because of insufficient data, especially for structure-based models, as well as a heterogeneity problem because of different label measurements and varying proteins across datasets. Here, we present an interpretable multi-task model to evaluate protein–ligand interactions (Multi-PLI). The model can run classification (binding or not) and regression (binding affinity) tasks concurrently by unifying different datasets. The model outperforms traditional docking and machine learning methods on both binary classification and regression tasks and achieves competitive results compared with some structure-based deep learning methods, even with the same training set size. Furthermore, combined with the proposed occlusion algorithm, the model can predict the important amino acids of proteins that are crucial for binding, thus providing a biological interpretation.
topic Interpretable
Deep learning
Multi‐task
Drug discovery
url https://doi.org/10.1186/s13321-021-00510-6
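
The abstract says that an occlusion algorithm is used to find the amino acids that matter most for binding, but this record gives no details of that algorithm. The general idea of occlusion-based attribution can be sketched as follows: mask part of the protein sequence, re-score it, and credit the score drop to the masked residues. Everything in this sketch is an assumption made for illustration, not the authors' method: the function names, the sliding window, the mask token "X", and the toy scoring function that stands in for a trained model.

```python
from typing import Callable, List


def occlusion_importance(
    sequence: str,
    score_fn: Callable[[str], float],
    window: int = 3,
    mask_token: str = "X",
) -> List[float]:
    """Per-residue importance from sliding-window occlusion.

    Each window of residues is replaced by mask_token and the sequence is
    re-scored. The resulting score drop is credited to every residue in the
    window and averaged over all windows that cover that residue.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    n = len(sequence)
    baseline = score_fn(sequence)
    totals = [0.0] * n
    counts = [0] * n
    for start in range(0, max(n - window + 1, 1)):
        end = min(start + window, n)
        occluded = sequence[:start] + mask_token * (end - start) + sequence[end:]
        drop = baseline - score_fn(occluded)
        for i in range(start, end):
            totals[i] += drop
            counts[i] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


def toy_score(sequence: str) -> float:
    """Hypothetical stand-in for a trained model: counts tryptophan (W)."""
    return float(sequence.count("W"))


if __name__ == "__main__":
    # With this toy scorer, only the W position should receive importance.
    print(occlusion_importance("AAAWAAA", toy_score, window=1))
```

In practice score_fn would wrap a trained predictor for a fixed ligand, and the window size trades localization against robustness: wider windows give smoother but less specific importance profiles.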