Interpretable deep neural network for cancer survival analysis by integrating genomic and clinical data

Abstract Background Understanding the complex biological mechanisms of cancer patient survival using genomic and clinical data is vital, not only to develop new treatments for patients, but also to improve survival prediction. However, highly nonlinear and high-dimension, low-sample size (HDLSS) dat...


Bibliographic Details
Main Authors: Jie Hao, Youngsoon Kim, Tejaswini Mallavarapu, Jung Hun Oh, Mingon Kang
Format: Article
Language: English
Published: BMC 2019-12-01
Series: BMC Medical Genomics
Subjects: Cox-PASNet; Deep neural network; Survival analysis; Glioblastoma multiforme; Ovarian cancer
Online Access: https://doi.org/10.1186/s12920-019-0624-2
id doaj-30aecae4d42c4b49a2aea6df3e654e04
record_format Article
spelling doaj-30aecae4d42c4b49a2aea6df3e654e04; 2021-04-02T17:48:54Z; eng; BMC; BMC Medical Genomics; 1755-8794; 2019-12-01; 12; S10; 1; 13; 10.1186/s12920-019-0624-2
Interpretable deep neural network for cancer survival analysis by integrating genomic and clinical data
Jie Hao: Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania
Youngsoon Kim: Department of Computer Science, Kennesaw State University
Tejaswini Mallavarapu: Analytics and Data Science Institute, Kennesaw State University
Jung Hun Oh: Department of Medical Physics, Memorial Sloan Kettering Cancer Center
Mingon Kang: Department of Computer Science, University of Nevada, Las Vegas
https://doi.org/10.1186/s12920-019-0624-2
Cox-PASNet; Deep neural network; Survival analysis; Glioblastoma multiforme; Ovarian cancer
collection DOAJ
language English
format Article
sources DOAJ
author Jie Hao
Youngsoon Kim
Tejaswini Mallavarapu
Jung Hun Oh
Mingon Kang
publisher BMC
series BMC Medical Genomics
issn 1755-8794
publishDate 2019-12-01
description Abstract. Background: Understanding the complex biological mechanisms of cancer patient survival using genomic and clinical data is vital, not only to develop new treatments for patients, but also to improve survival prediction. However, highly nonlinear, high-dimension, low-sample-size (HDLSS) data pose computational challenges for conventional survival analysis. Results: We propose a novel biologically interpretable, pathway-based sparse deep neural network, named Cox-PASNet, which integrates high-dimensional gene expression data and clinical data in a simple neural network architecture for survival analysis. Cox-PASNet is biologically interpretable in that nodes in the neural network correspond to genes and biological pathways, while the network captures the nonlinear and hierarchical effects of biological pathways associated with cancer patient survival. We also propose a heuristic optimization solution to train Cox-PASNet with HDLSS data. Cox-PASNet was evaluated extensively by comparing its predictive performance with that of current state-of-the-art methods on glioblastoma multiforme (GBM) and ovarian serous cystadenocarcinoma (OV) cancer. In the experiments, Cox-PASNet outperformed the benchmark methods. Moreover, the neural network architecture of Cox-PASNet was biologically interpreted, and several significant prognostic genes and biological pathways were identified. Conclusions: Cox-PASNet models biological mechanisms in the neural network by incorporating biological pathway databases and sparse coding. The neural network of Cox-PASNet can identify nonlinear and hierarchical associations of genomic and clinical data with cancer patient survival. The open-source PyTorch implementation of Cox-PASNet for training, evaluation, and model interpretation is available at https://github.com/DataX-JieHao/Cox-PASNet.
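The two ideas named in the abstract, a pathway-constrained sparse layer and a Cox-style survival objective, can be illustrated with a minimal sketch. This is not the authors' PyTorch code: the function names `masked_linear` and `cox_neg_partial_log_likelihood`, the toy gene-to-pathway mask, and the plain-Python formulation are all assumptions made for illustration, and the loss shown is the standard negative Cox partial log-likelihood with no special tie correction.

```python
import math

def masked_linear(x, weights, mask):
    """Sparse layer sketch: output j only sees inputs i where mask[j][i] == 1.

    Here mask[j][i] = 1 is assumed to mean "gene i belongs to pathway j".
    """
    return [
        sum(w * m * xi for w, m, xi in zip(w_row, m_row, x))
        for w_row, m_row in zip(weights, mask)
    ]

def cox_neg_partial_log_likelihood(risk, time, event):
    """Average negative Cox partial log-likelihood over observed events.

    risk  : predicted log-risk score per patient (network output)
    time  : survival or censoring time per patient
    event : 1 if the event was observed, 0 if censored
    """
    total, n_events = 0.0, 0
    for i in range(len(risk)):
        if event[i] != 1:
            continue  # censored patients only appear in risk sets
        # Risk set: patients still under observation at time[i].
        denom = sum(math.exp(risk[j]) for j in range(len(risk)) if time[j] >= time[i])
        total += risk[i] - math.log(denom)
        n_events += 1
    return -total / n_events if n_events else 0.0

# Toy usage: 3 genes, 2 hypothetical pathways.
pathway_mask = [[1, 1, 0], [0, 1, 1]]
weights = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
pathway_values = masked_linear([1.0, 2.0, 3.0], weights, pathway_mask)
loss = cox_neg_partial_log_likelihood([0.0, 0.0], [1.0, 2.0], [1, 1])
```

In a trainable model the mask would typically be applied to a weight matrix at every forward pass so that masked connections stay at zero; the repository linked above is the place to check how the authors actually do this.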
topic Cox-PASNet
Deep neural network
Survival analysis
Glioblastoma multiforme
Ovarian cancer
url https://doi.org/10.1186/s12920-019-0624-2