EnACP: An Ensemble Learning Model for Identification of Anticancer Peptides

As cancer remains one of the main threats to human life, developing efficient cancer treatments is urgent. Anticancer peptides, which could overcome the significant side effects and poor results of traditional cancer treatments, have become a potential new alternative in recent years. However, identifyi...


Bibliographic Details
Main Authors: Ruiquan Ge, Guanwen Feng, Xiaoyang Jing, Renfeng Zhang, Pu Wang, Qing Wu
Format: Article
Language: English
Published: Frontiers Media S.A. 2020-07-01
Series: Frontiers in Genetics
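Subjects: anticancer peptides, feature representation, ensemble learning, pseudo amino acid composition, system biology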
Online Access: https://www.frontiersin.org/article/10.3389/fgene.2020.00760/full
id doaj-d7eafb2fa92742b1b4f8b2c899b652f7
record_format Article
spelling doaj-d7eafb2fa92742b1b4f8b2c899b652f7
2020-11-25T03:07:38Z
eng
Frontiers Media S.A.
Frontiers in Genetics
1664-8021
2020-07-01
11
10.3389/fgene.2020.00760
553906
EnACP: An Ensemble Learning Model for Identification of Anticancer Peptides
Ruiquan Ge (0): Key Laboratory of Complex Systems Modeling and Simulation, School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China
Guanwen Feng (1): Xi'an Key Laboratory of Big Data and Intelligent Vision, School of Computer Science and Technology, Xidian University, Xi'an, China
Xiaoyang Jing (2): Toyota Technological Institute at Chicago, Chicago, IL, United States
Renfeng Zhang (3): Shandong Provincial Hospital Affiliated to Shandong First Medical University, Jinan, China
Pu Wang (4): Computer School, Hubei University of Arts and Science, Xiangyang, China
Qing Wu (5): Key Laboratory of Complex Systems Modeling and Simulation, School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China
collection DOAJ
language English
format Article
sources DOAJ
author Ruiquan Ge
Guanwen Feng
Xiaoyang Jing
Renfeng Zhang
Pu Wang
Qing Wu
title EnACP: An Ensemble Learning Model for Identification of Anticancer Peptides
publisher Frontiers Media S.A.
series Frontiers in Genetics
issn 1664-8021
publishDate 2020-07-01
description As cancer remains one of the main threats to human life, developing efficient cancer treatments is urgent. Anticancer peptides, which could overcome the significant side effects and poor results of traditional cancer treatments, have become a potential new alternative in recent years. However, identifying anticancer peptides by experimental methods is time-consuming and resource-intensive, so it is of great significance to develop effective computational tools that quickly and accurately identify potential anticancer peptides from amino acid sequences. For most current computational methods, feature representation plays a key role in their success. This study proposes a novel, fast, and accurate approach to identifying anticancer peptides using diversified feature representations and an ensemble learning method. For the feature representations, information is encoded from multidimensional feature spaces, including sequence composition, sequence order, and physicochemical properties, among others. To better model the potential relationships among peptides, multiple ensemble classifiers (LightGBMs) are first applied to the different feature sets. Their outputs are then used as inputs to a support vector machine classifier, which effectively identifies anticancer peptides. Experimental results on cross-validation and independent test sets demonstrate that our method achieves better or comparable performance compared with other state-of-the-art methods.
topic anticancer peptides
feature representation
ensemble learning
pseudo amino acid composition
system biology
url https://www.frontiersin.org/article/10.3389/fgene.2020.00760/full
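
The abstract in this record describes a two-stage pipeline: separate boosted-tree models are trained on different feature representations, and their outputs become the inputs of a support vector machine. The following is a minimal sketch of that idea under several assumptions, not the authors' implementation. scikit-learn's GradientBoostingClassifier stands in for LightGBM, only two simple feature groups are used (amino acid composition and dipeptide composition), out-of-fold probabilities are an assumed way of passing stage-one outputs to the SVM, and the toy sequences and labels are invented purely for illustration. All function and variable names are hypothetical.

```python
from itertools import product

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_predict
from sklearn.svm import SVC

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(seq):
    """Fraction of each of the 20 standard residues in a non-empty sequence."""
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Fraction of each adjacent residue pair (a simple sequence-order feature)."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    return [pairs.count(d) / max(len(pairs), 1) for d in DIPEPTIDES]


# Illustrative feature groups only; the record does not list the exact encodings.
FEATURE_GROUPS = [amino_acid_composition, dipeptide_composition]


def encode(seqs, encoder):
    """Turn a list of sequences into a 2-D feature matrix for one feature group."""
    return np.array([encoder(s) for s in seqs])


def fit_stack(seqs, labels, n_folds=3):
    """Stage 1: one boosted-tree model per feature group. Stage 2: SVM on their outputs."""
    labels = np.asarray(labels)
    base_models, meta_columns = [], []
    for encoder in FEATURE_GROUPS:
        X = encode(seqs, encoder)
        # Stand-in for LightGBM (assumption): a scikit-learn gradient-boosted ensemble.
        model = GradientBoostingClassifier(n_estimators=50, random_state=0)
        # Out-of-fold probabilities, so the SVM is not trained on leaked predictions.
        oof = cross_val_predict(model, X, labels, cv=n_folds, method="predict_proba")
        meta_columns.append(oof[:, 1])
        base_models.append(model.fit(X, labels))
    meta = SVC(kernel="rbf").fit(np.column_stack(meta_columns), labels)
    return base_models, meta


def predict_stack(seqs, base_models, meta):
    """Feed each base model's positive-class probability into the SVM."""
    columns = [model.predict_proba(encode(seqs, encoder))[:, 1]
               for encoder, model in zip(FEATURE_GROUPS, base_models)]
    return meta.predict(np.column_stack(columns))


# Invented toy data (not real anticancer peptides): 1 = positive, 0 = negative.
TOY_SEQS = ["KKLFKKILKK", "KWKLFKKIGK", "FLKKLLKKAL", "KKWKKLLKKL", "GLKKLFKKIL", "KLAKKLAKKL",
            "DEEDSGSEDD", "SGDEESTDEG", "EEDDSTGSEE", "TDEEGSDDES", "GSEDDEETSD", "DDSEEGTEDS"]
TOY_LABELS = [1] * 6 + [0] * 6

base_models, meta_model = fit_stack(TOY_SEQS, TOY_LABELS)
print(predict_stack(["KKLLKKLFKK", "EDSGDEETDS"], base_models, meta_model))
```

Training each base model on its own feature group keeps the representations from swamping one another, and the second-stage classifier only has to learn how much to trust each group. Swapping in LightGBM's scikit-learn-style classifier for the stand-in should require changing only the line that constructs the base model.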