Predicting Human and Animal Protein Subcellular Location
Main Author: | Khavari, Sepideh |
---|---|
Language: | English |
Published: | Youngstown State University / OhioLINK, 2016 |
Subjects: | Biology Statistics Protein Subcellular location Computational predictors |
Online Access: | http://rave.ohiolink.edu/etdc/view?acc_num=ysu1472463855 |
id |
ndltd-OhioLink-oai-etd.ohiolink.edu-ysu1472463855 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-OhioLink-oai-etd.ohiolink.edu-ysu1472463855 2021-08-03T06:38:29Z Predicting Human and Animal Protein Subcellular Location Khavari, Sepideh Biology Statistics Protein Subcellular location Computational predictors An important objective in cell biology is to determine the subcellular location of different proteins and their functions in the cell. The subcellular location of a protein can be identified either through biochemical experiments or with computational predictors. Since the former method is both time-consuming and expensive, computational predictors offer a more efficient way of solving the problem. Computational predictors are also well suited to this task because the number of newly discovered proteins has been increasing tremendously as a result of the genome sequencing project. The main objective of this study is to use several different classifiers to predict the subcellular location of animal and human proteins and to determine which of these classifiers performs best. The data for this study was obtained from the Universal Protein Resource (UniProt), a database of protein sequences and annotations. From the UniProt Knowledgebase (UniProtKB), the human and animal proteins that were manually reviewed and annotated (Swiss-Prot) were chosen for this study. A reliable benchmark dataset is obtained by applying criteria established in earlier studies for predicting protein subcellular locations. After applying these criteria to the original dataset, the working benchmark dataset includes 2944 protein sequences.
The subcellular locations of these proteins are the nucleus (1001 proteins), the cytoplasm (540 proteins), the secreted (436 proteins), the mitochondria (328 proteins), the cell membrane (286 proteins), the endoplasmic reticulum (207 proteins), the Golgi apparatus (86 proteins), the peroxisome (30 proteins), and the lysosome (30 proteins). Therefore, there are 9 different subcellular locations for proteins in this dataset. The method used for representing proteins in the study is the pseudo-amino acid composition (PseAA composition) adapted from earlier studies. The predictors used to predict the subcellular location of animal and human proteins include Random Forest, Adaptive Boosting (AdaBoost), Stage-wise Additive Modeling using a Multi-class Exponential loss function (SAMME), Support Vector Machines (SVMs), and Artificial Neural Networks (ANNs). The results from this study establish that the SVM classifier yielded the best overall accuracy for predicting the subcellular location of proteins. Most of the computational classifiers used in this study produced better prediction results for proteins in the nucleus, the secreted, and the cell membrane locations. The secreted and the cell membrane locations had high specificity values with all of the classifiers used in this study. The nucleus had the best prediction results, including a high sensitivity and a high MCC value with the Bagging method. 2016-08-31 English text Youngstown State University / OhioLINK http://rave.ohiolink.edu/etdc/view?acc_num=ysu1472463855 http://rave.ohiolink.edu/etdc/view?acc_num=ysu1472463855 unrestricted This thesis or dissertation is protected by copyright: all rights reserved. It may not be copied or redistributed beyond the terms of applicable copyright laws. |
collection |
NDLTD |
language |
English |
sources |
NDLTD |
topic |
Biology Statistics Protein Subcellular location Computational predictors |
spellingShingle |
Biology Statistics Protein Subcellular location Computational predictors Khavari, Sepideh Predicting Human and Animal Protein Subcellular Location |
author |
Khavari, Sepideh |
author_facet |
Khavari, Sepideh |
author_sort |
Khavari, Sepideh |
title |
Predicting Human and Animal Protein Subcellular Location |
title_short |
Predicting Human and Animal Protein Subcellular Location |
title_full |
Predicting Human and Animal Protein Subcellular Location |
title_fullStr |
Predicting Human and Animal Protein Subcellular Location |
title_full_unstemmed |
Predicting Human and Animal Protein Subcellular Location |
title_sort |
predicting human and animal protein subcellular location |
publisher |
Youngstown State University / OhioLINK |
publishDate |
2016 |
url |
http://rave.ohiolink.edu/etdc/view?acc_num=ysu1472463855 |
work_keys_str_mv |
AT khavarisepideh predictinghumanandanimalproteinsubcellularlocation |
_version_ |
1719440757017804800 |