Deep Learning for Drug Discovery: A Study of Identifying High Efficacy Drug Compounds Using a Cascade Transfer Learning Approach

In this research, we applied deep learning to rank the effectiveness of candidate drug compounds in combating viral cells, in particular, SARS-CoV-2 viral cells. For this purpose, two different datasets from Recursion Pharmaceuticals, an siRNA image dataset (RxRx1), which was used to build and calib...

Bibliographic Details
Main Authors: Dylan Zhuang, Ali K. Ibrahim
Format: Article
Language: English
Published: MDPI AG 2021-08-01
Series: Applied Sciences
Subjects: COVID-19; drug discovery; deep learning; cascade transfer learning; DenseNet
Online Access: https://www.mdpi.com/2076-3417/11/17/7772
id doaj-7eb11042e69547d3b068ee561c97a184
record_format Article
spelling doaj-7eb11042e69547d3b068ee561c97a184
2021-09-09T13:38:04Z
eng
MDPI AG
Applied Sciences
2076-3417
2021-08-01
11
7772
7772
10.3390/app11177772
Deep Learning for Drug Discovery: A Study of Identifying High Efficacy Drug Compounds Using a Cascade Transfer Learning Approach
Dylan Zhuang (EECS Department, Florida Atlantic University, Boca Raton, FL 33431, USA)
Ali K. Ibrahim (EECS Department, Florida Atlantic University, Boca Raton, FL 33431, USA)
https://www.mdpi.com/2076-3417/11/17/7772
COVID-19
drug discovery
deep learning
cascade transfer learning
DenseNet
collection DOAJ
language English
format Article
sources DOAJ
author Dylan Zhuang
Ali K. Ibrahim
spellingShingle Dylan Zhuang
Ali K. Ibrahim
Deep Learning for Drug Discovery: A Study of Identifying High Efficacy Drug Compounds Using a Cascade Transfer Learning Approach
Applied Sciences
COVID-19
drug discovery
deep learning
cascade transfer learning
DenseNet
author_facet Dylan Zhuang
Ali K. Ibrahim
author_sort Dylan Zhuang
title Deep Learning for Drug Discovery: A Study of Identifying High Efficacy Drug Compounds Using a Cascade Transfer Learning Approach
publisher MDPI AG
series Applied Sciences
issn 2076-3417
publishDate 2021-08-01
description In this research, we applied deep learning to rank the effectiveness of candidate drug compounds in combating viral cells, in particular, SARS-CoV-2 viral cells. For this purpose, we used two datasets from Recursion Pharmaceuticals: an siRNA image dataset (RxRx1), which was used to build and calibrate our model for feature extraction, and a SARS-CoV-2 dataset (RxRx19a), which was used to train our model to rank the efficacy of candidate drug compounds. The SARS-CoV-2 dataset contains healthy, uninfected control or “mock” cells, as well as “active viral” cells (cells infected with SARS-CoV-2); these were the two cell types used to train our deep learning model. In addition, it contains viral cells treated with different drug compounds, which were used only to test our model, not to train it. We devised a new cascade transfer learning strategy to construct our model. We first trained a deep learning model, the DenseNet, for feature extraction on the siRNA dataset, whose characteristics are similar to those of the SARS-CoV-2 dataset. We then added additional layers, including a SoftMax output layer, and retrained the model with active viral cells and mock cells from the SARS-CoV-2 dataset. In the test phase, the SoftMax layer outputs probability (equivalently, efficacy) scores, which allow us to rank candidate compounds and to study the performance of each candidate compound statistically. With this approach, we identified several compounds with high efficacy scores that are promising for the therapeutic treatment of COVID-19. The compounds showing the most promise were GS-441524 and then Remdesivir, which overlap with those reported in the literature and with drugs that are approved by the FDA or are going through clinical or preclinical trials. This study shows the potential of deep learning to identify promising compounds and to aid rapid responses to future pandemic outbreaks.
topic COVID-19
drug discovery
deep learning
cascade transfer learning
DenseNet
url https://www.mdpi.com/2076-3417/11/17/7772
work_keys_str_mv AT dylanzhuang deeplearningfordrugdiscoveryastudyofidentifyinghighefficacydrugcompoundsusingacascadetransferlearningapproach
AT alikibrahim deeplearningfordrugdiscoveryastudyofidentifyinghighefficacydrugcompoundsusingacascadetransferlearningapproach
_version_ 1717760968168046592
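
The ranking step described in the abstract (SoftMax probabilities used as efficacy scores to rank compounds) can be illustrated with a minimal sketch. This is not the authors' code. It assumes a two-class output ordered as (mock, active viral), takes the probability of the "mock" class as the efficacy score, and averages that score over all images of a compound; the record does not state the class order or the aggregation rule, so both are assumptions. The function names, compound names, and logits are hypothetical.

```python
import math
from collections import defaultdict
from statistics import mean


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rank_compounds(predictions, healthy_index=0):
    """Rank compounds by mean probability of the healthy ("mock") class.

    predictions: iterable of (compound_name, logits) pairs, one per image.
    healthy_index: position of the "mock" class in the logits (assumed 0).
    Returns a list of (compound_name, mean_score), best score first.
    """
    scores = defaultdict(list)
    for compound, logits in predictions:
        scores[compound].append(softmax(logits)[healthy_index])
    ranked = [(name, mean(values)) for name, values in scores.items()]
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Hypothetical classifier outputs for two placeholder compounds.
example = [
    ("compound_A", [2.0, 0.0]),
    ("compound_A", [1.0, 0.0]),
    ("compound_B", [0.0, 1.0]),
    ("compound_B", [0.0, 3.0]),
]
ranking = rank_compounds(example)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this sketch a compound whose treated cells look more like mock cells to the classifier receives a higher score. Averaging is only one possible aggregation; a median or a per-concentration breakdown would fit the same interface.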