EnACP: An Ensemble Learning Model for Identification of Anticancer Peptides
As cancer remains one of the main threats to human life, developing efficient cancer treatments is urgent. Anticancer peptides, which could overcome the significant side effects and poor results of traditional cancer treatments, have emerged as a promising alternative in recent years. However, identifyi...
Main Authors: | Ruiquan Ge, Guanwen Feng, Xiaoyang Jing, Renfeng Zhang, Pu Wang, Qing Wu |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2020-07-01 |
Series: | Frontiers in Genetics |
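Subjects: | anticancer peptides; feature representation; ensemble learning; pseudo amino acid composition; system biology |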
Online Access: | https://www.frontiersin.org/article/10.3389/fgene.2020.00760/full |
id |
doaj-d7eafb2fa92742b1b4f8b2c899b652f7 |
---|---|
record_format |
Article |
spelling |
doaj-d7eafb2fa92742b1b4f8b2c899b652f7; 2020-11-25T03:07:38Z; eng; Frontiers Media S.A.; Frontiers in Genetics; 1664-8021; 2020-07-01; 11; 10.3389/fgene.2020.00760; 553906; EnACP: An Ensemble Learning Model for Identification of Anticancer Peptides; Ruiquan Ge (0); Guanwen Feng (1); Xiaoyang Jing (2); Renfeng Zhang (3); Pu Wang (4); Qing Wu (5); (0) Key Laboratory of Complex Systems Modeling and Simulation, School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China; (1) Xi'an Key Laboratory of Big Data and Intelligent Vision, School of Computer Science and Technology, Xidian University, Xi'an, China; (2) Toyota Technological Institute at Chicago, Chicago, IL, United States; (3) Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China; (4) Computer School, Hubei University of Arts and Science, Xiangyang, China; (5) Key Laboratory of Complex Systems Modeling and Simulation, School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China; https://www.frontiersin.org/article/10.3389/fgene.2020.00760/full; anticancer peptides; feature representation; ensemble learning; pseudo amino acid composition; system biology |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Ruiquan Ge; Guanwen Feng; Xiaoyang Jing; Renfeng Zhang; Pu Wang; Qing Wu |
author_sort |
Ruiquan Ge |
title |
EnACP: An Ensemble Learning Model for Identification of Anticancer Peptides |
publisher |
Frontiers Media S.A. |
series |
Frontiers in Genetics |
issn |
1664-8021 |
publishDate |
2020-07-01 |
description |
As cancer remains one of the main threats to human life, developing efficient cancer treatments is urgent. Anticancer peptides, which could overcome the significant side effects and poor results of traditional cancer treatments, have emerged as a promising alternative in recent years. However, because identifying anticancer peptides by experimental methods is time-consuming and resource-intensive, it is important to develop effective computational tools that quickly and accurately identify potential anticancer peptides from amino acid sequences. For most current computational methods, feature representation plays a key role in their success. This study proposes a novel, fast, and accurate approach to identifying anticancer peptides using diversified feature representations and an ensemble learning method. For the feature representations, information is encoded from multidimensional feature spaces, including sequence composition, sequence order, and physicochemical properties. To better model the potential relationships among peptides, multiple LightGBM ensemble classifiers are first applied, one to each of the different feature sets. Their outputs are then used as inputs to a support vector machine classifier, which effectively identifies anticancer peptides. Experimental results on cross-validation and independent test sets demonstrate that our method achieves performance better than or comparable to that of other state-of-the-art methods. |
topic |
anticancer peptides; feature representation; ensemble learning; pseudo amino acid composition; system biology |
url |
https://www.frontiersin.org/article/10.3389/fgene.2020.00760/full |
_version_ |
1724669202443272192 |
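The description above says that peptide information is encoded into several separate feature sets before classification. The record does not list the exact encodings, so the following is only a minimal sketch under stated assumptions: it uses amino acid composition as an example of a sequence-composition feature and dipeptide composition as an example of a local sequence-order feature. The function names (`amino_acid_composition`, `dipeptide_composition`, `encode`) and the toy input string are hypothetical and are not taken from the EnACP paper.

```python
from collections import Counter
from itertools import product

# The 20 standard amino acids, in a fixed order so vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in the same fixed order.
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Return 20 residue frequencies (an illustrative composition feature)."""
    seq = seq.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return 400 adjacent-pair frequencies (an illustrative order feature)."""
    seq = seq.upper()
    # Count only pairs in which both residues are standard amino acids.
    pairs = Counter(
        seq[i:i + 2]
        for i in range(len(seq) - 1)
        if seq[i] in AMINO_ACIDS and seq[i + 1] in AMINO_ACIDS
    )
    total = sum(pairs.values())
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [pairs[dp] / total for dp in DIPEPTIDES]


def encode(seq):
    """Encode one peptide into separate, named feature sets.

    Keeping the sets separate mirrors the idea of giving each feature
    set to its own first-stage classifier (assumption: one per set).
    """
    return {
        "aac": amino_acid_composition(seq),
        "dpc": dipeptide_composition(seq),
    }


if __name__ == "__main__":
    # Arbitrary toy sequence, not a real anticancer peptide.
    features = encode("ACCA")
    print(len(features["aac"]), len(features["dpc"]))
```

In a pipeline like the one described, each named feature set would be passed to its own first-stage classifier, and the first-stage outputs would be concatenated as the input to the final classifier. That training step is left out here because the record gives no hyperparameters or data.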