Comparing biological information contained in mRNA and non-coding RNAs for classification of lung cancer patients

Abstract. Background: Deciphering the meaning of human DNA is an outstanding goal that would revolutionize medicine and the way we treat diseases. In recent years, non-coding RNAs have attracted much attention and have been shown to be functional in part. Yet the importance of these RNAs, especially for...

Bibliographic Details
Main Authors: Johannes Smolander, Alexey Stupnikov, Galina Glazko, Matthias Dehmer, Frank Emmert-Streib
Format: Article
Language: English
Published: BMC 2019-12-01
Series: BMC Cancer
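Subjects: Deep learning; Deep belief network; Classification; Non-coding RNA; Lung cancer and Machine learning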
Online Access: https://doi.org/10.1186/s12885-019-6338-1
id doaj-19d2fd4b1bb24fba953db4cdaf2340e1
record_format Article
spelling doaj-19d2fd4b1bb24fba953db4cdaf2340e1
2020-12-06T12:53:54Z
eng
BMC
BMC Cancer
1471-2407
2019-12-01
Volume 19, issue 1, pages 1-15
10.1186/s12885-019-6338-1
Comparing biological information contained in mRNA and non-coding RNAs for classification of lung cancer patients
Johannes Smolander (Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University)
Alexey Stupnikov (Department of Oncology, School of Medicine, Johns Hopkins University)
Galina Glazko (Department of Biomedical Informatics, University of Arkansas for Medical Sciences)
Matthias Dehmer (Institute for Intelligent Production, Faculty for Management, University of Applied Sciences Upper Austria)
Frank Emmert-Streib (Predictive Society and Data Analytics Lab, Faculty of Information Technology and Communication Sciences, Tampere University)
Abstract. Background: Deciphering the meaning of human DNA is an outstanding goal that would revolutionize medicine and the way we treat diseases. In recent years, non-coding RNAs have attracted much attention and have been shown to be functional in part. Yet the importance of these RNAs, especially for higher biological functions, remains under investigation. Methods: In this paper, we analyze RNA-seq data, including non-coding and protein-coding RNAs, from patients with lung adenocarcinoma, a histologic subtype of non-small-cell lung cancer, using deep learning neural networks and other state-of-the-art classification methods. The purpose of our paper is threefold. First, we compare the classification performance of different versions of deep belief networks with that of SVMs, decision trees and random forests. Second, we compare the classification capabilities of protein-coding and non-coding RNAs. Third, we study the influence of feature selection on classification performance. Results: First, we find that deep belief networks perform at least competitively with other state-of-the-art classifiers. Second, data from non-coding RNAs perform better than data from coding RNAs across a number of different classification methods. This demonstrates the equivalence of the predictive information captured by non-coding RNAs and by protein-coding RNAs, the latter being conventionally used in computational diagnostic tasks. Third, we find that feature selection has, in general, a negative effect on classification performance, which means that unfiltered data with all features give the best classification results. Conclusions: Our study is the first to use ncRNAs beyond miRNAs for the computational classification of cancer and to perform a direct comparison of the classification capabilities of protein-coding and non-coding RNAs.
https://doi.org/10.1186/s12885-019-6338-1
Deep learning
Deep belief network
Classification
Non-coding RNA
Lung cancer and Machine learning
collection DOAJ
language English
format Article
sources DOAJ
author Johannes Smolander
Alexey Stupnikov
Galina Glazko
Matthias Dehmer
Frank Emmert-Streib
title Comparing biological information contained in mRNA and non-coding RNAs for classification of lung cancer patients
publisher BMC
series BMC Cancer
issn 1471-2407
publishDate 2019-12-01
description Abstract. Background: Deciphering the meaning of human DNA is an outstanding goal that would revolutionize medicine and the way we treat diseases. In recent years, non-coding RNAs have attracted much attention and have been shown to be functional in part. Yet the importance of these RNAs, especially for higher biological functions, remains under investigation. Methods: In this paper, we analyze RNA-seq data, including non-coding and protein-coding RNAs, from patients with lung adenocarcinoma, a histologic subtype of non-small-cell lung cancer, using deep learning neural networks and other state-of-the-art classification methods. The purpose of our paper is threefold. First, we compare the classification performance of different versions of deep belief networks with that of SVMs, decision trees and random forests. Second, we compare the classification capabilities of protein-coding and non-coding RNAs. Third, we study the influence of feature selection on classification performance. Results: First, we find that deep belief networks perform at least competitively with other state-of-the-art classifiers. Second, data from non-coding RNAs perform better than data from coding RNAs across a number of different classification methods. This demonstrates the equivalence of the predictive information captured by non-coding RNAs and by protein-coding RNAs, the latter being conventionally used in computational diagnostic tasks. Third, we find that feature selection has, in general, a negative effect on classification performance, which means that unfiltered data with all features give the best classification results. Conclusions: Our study is the first to use ncRNAs beyond miRNAs for the computational classification of cancer and to perform a direct comparison of the classification capabilities of protein-coding and non-coding RNAs.
topic Deep learning
Deep belief network
Classification
Non-coding RNA
Lung cancer and Machine learning
url https://doi.org/10.1186/s12885-019-6338-1