A Cytokine Protein Identification Model Based on the Compressed PseKRAAC Features

Bibliographic Details
Main Authors: Xing Gao, Guilin Li
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/9153552/
Description
Summary: Cytokine proteins, which form a complex cytokine regulatory network, participate in a variety of important physiological functions of the human body. Identifying cytokine proteins is important and has attracted the attention of many researchers. In this paper, we propose an MRMD-cosine model based on PseKRAAC features to identify cytokine proteins. First, the PseKRAAC feature extraction method is used to extract four feature sets from the cytokine proteins: type1 g-gap, type1 lambda, type2 g-gap and type2 lambda. The MRMD algorithm is then used to remove redundant features from the feature sets. The MRMD algorithm uses three metrics to measure the redundancy of a feature set: Euclidean distance, cosine similarity and the Tanimoto coefficient. Bagging and random forest algorithms are used to construct classification models on the compressed feature sets. The experimental results show that the MRMD-cosine model built with the random forest algorithm on the type1 lambda feature set achieves the best performance among all models. Finally, we compare the MRMD-cosine model with another state-of-the-art model, a greedy-based feature compression model built on CNT features. The comparison shows that the MRMD-cosine model achieves better accuracy while using only 15% of the number of features used by the greedy-based model.
ISSN: 2169-3536
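
Note: The summary names three measures that MRMD uses to quantify redundancy between features. The sketch below shows only the standard textbook definitions of those measures for two real-valued feature vectors. It is not the authors' MRMD implementation; the function names are hypothetical, the real-valued form of the Tanimoto coefficient is an assumption, and this record does not say how MRMD combines the measures with feature relevance.

```python
import math


def euclidean_distance(u, v):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (assumes neither is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def tanimoto_coefficient(u, v):
    """Tanimoto coefficient for real-valued vectors (assumed form):
    u.v / (|u|^2 + |v|^2 - u.v). Assumes the vectors are not both all zeros.
    """
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sum(a * a for a in u) + sum(b * b for b in v) - dot)


# Two hypothetical feature columns, each observed on three samples.
feature_a = [1.0, 0.0, 1.0]
feature_b = [1.0, 1.0, 0.0]

print(euclidean_distance(feature_a, feature_b))
print(cosine_similarity(feature_a, feature_b))
print(tanimoto_coefficient(feature_a, feature_b))
```

A small distance, or a similarity close to 1, indicates that two features carry largely overlapping information, which is the intuition behind treating one of them as redundant.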