A KNN Model Based on Manhattan Distance to Identify the SNARE Proteins

Bibliographic Details
Main Authors: Xing Gao, Guilin Li
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/9119343/
Description
Summary: SNARE proteins, known as membrane fusion proteins, play a primary role in mediating vesicle fusion. Loss of SNARE protein function can lead to a variety of diseases, so a method that accurately identifies SNARE proteins is needed. In this paper, we test combinations of sampling methods (resampling, SMOTE and no sampling), feature extraction approaches (188D, K-skip-2-gram and CKSAAP) and distance measures (Chebyshev, Euclidean, Manhattan and Minkowski distance) to find a suitable model for identifying SNARE proteins. Through extensive experiments, we construct a Manhattan-distance-based KNN model that combines the CKSAAP feature extraction approach with no sampling, and this model achieves the best identification performance among all combinations. Finally, we compare our KNN-based model with a deep-learning-based model (SNARE-CNN) on four metrics: SN, SP, ACC and MCC. The experimental results show that our model outperforms SNARE-CNN.
ISSN: 2169-3536
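
Illustrative sketch: The pipeline named in the summary (CKSAAP features, a KNN classifier using Manhattan distance, and evaluation by SN, SP, ACC and MCC) can be sketched in plain Python as below. This is a minimal illustration and not the authors' implementation: the function names, the gap range `k_max`, the neighbour count `k`, and the 0/1 label encoding are all assumptions, since the record does not state the paper's settings. SN, SP, ACC and MCC are taken here to mean sensitivity, specificity, accuracy and Matthews correlation coefficient.

```python
from collections import Counter
from itertools import product
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered amino acid pairs, in a fixed order.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def cksaap(seq, k_max=1):
    """Composition of k-spaced amino acid pairs for gaps 0..k_max.

    k_max is an assumed setting; the record does not give the paper's value.
    Pairs containing non-standard residues are counted in the window total
    but have no feature slot of their own.
    """
    features = []
    for k in range(k_max + 1):
        windows = len(seq) - k - 1  # number of residue pairs separated by k
        counts = Counter(seq[i] + seq[i + k + 1] for i in range(max(windows, 0)))
        features.extend(counts[p] / windows if windows > 0 else 0.0 for p in PAIRS)
    return features


def manhattan(u, v):
    """Manhattan (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(u, v))


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors closest to `query`."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: manhattan(pair[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def evaluate(y_true, y_pred):
    """Return SN, SP, ACC and MCC for binary labels (1 = SNARE, 0 = non-SNARE)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "SN": tp / (tp + fn) if tp + fn else 0.0,
        "SP": tn / (tn + fp) if tn + fp else 0.0,
        "ACC": (tp + tn) / len(y_true) if y_true else 0.0,
        "MCC": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


if __name__ == "__main__":
    # Made-up toy sequences and labels, for demonstration only.
    train_seqs = ["ACACACAC", "ACACACAA", "WYWYWYWY", "WYWYWYWW"]
    train_y = [1, 1, 0, 0]
    train_x = [cksaap(s) for s in train_seqs]
    test_seqs, test_y = ["ACACAC", "WYWYWY"], [1, 0]
    preds = [knn_predict(train_x, train_y, cksaap(s)) for s in test_seqs]
    print(preds, evaluate(test_y, preds))
```

With no sampling step, as in the combination the summary reports as best, the feature vectors go straight into the classifier. A resampling or SMOTE variant would rebalance `train_x` and `train_y` before calling `knn_predict`.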