Predicting Human and Animal Protein Subcellular Location

Bibliographic Details
Main Author: Khavari, Sepideh
Language: English
Published: Youngstown State University / OhioLINK 2016
Subjects: Biology; Statistics; Protein; Subcellular location; Computational predictors
Online Access: http://rave.ohiolink.edu/etdc/view?acc_num=ysu1472463855
id ndltd-OhioLink-oai-etd.ohiolink.edu-ysu1472463855
record_format oai_dc
spelling ndltd-OhioLink-oai-etd.ohiolink.edu-ysu1472463855 2021-08-03T06:38:29Z Predicting Human and Animal Protein Subcellular Location Khavari, Sepideh Biology Statistics Protein Subcellular location Computational predictors An important objective in cell biology is to determine the subcellular location of different proteins and their functions in the cell. The subcellular location of a protein can be identified either through biochemical experiments or with computational predictors. Since the former approach is both time-consuming and expensive, computational predictors offer a more efficient way of solving the problem. They are also well suited to the task because the number of newly discovered proteins has been increasing tremendously as a result of the genome sequencing project. The main objective of this study is to use several different classifiers to predict the subcellular location of animal and human proteins and to determine which of these classifiers performs best. The data for this study were obtained from the Universal Protein Resource (UniProt), a database of protein sequences and annotations. From the UniProt Knowledgebase (UniProtKB), the human and animal proteins that were manually reviewed and annotated (Swiss-Prot) were chosen for this study. A reliable benchmark dataset is obtained by applying criteria established in earlier studies for predicting protein subcellular locations. Applying these criteria to the original dataset yields a working benchmark dataset of 2944 protein sequences.
The subcellular locations of these proteins are the nucleus (1001 proteins), the cytoplasm (540 proteins), secreted (436 proteins), the mitochondria (328 proteins), the cell membrane (286 proteins), the endoplasmic reticulum (207 proteins), the Golgi apparatus (86 proteins), the peroxisome (30 proteins), and the lysosome (30 proteins), for a total of 9 subcellular locations in this dataset. Proteins are represented using the pseudo-amino acid composition (PseAA composition) adapted from earlier studies. The predictors used to predict the subcellular location of animal and human proteins include Random Forest, Adaptive Boosting (AdaBoost), Stage-wise Additive Modeling using a Multi-class Exponential loss function (SAMME), Support Vector Machines (SVMs), and Artificial Neural Networks (ANNs). The results of this study establish that the SVM classifier yielded the best overall accuracy for predicting the subcellular location of proteins. Most of the classifiers used in this study produced better prediction results for proteins in the nucleus, secreted, and cell membrane locations. The secreted and cell membrane locations had high specificity values with all of the classifiers used in this study. The nucleus had the best prediction results, including a high sensitivity and a high Matthews correlation coefficient (MCC) value, when using the Bagging method. 2016-08-31 English text Youngstown State University / OhioLINK http://rave.ohiolink.edu/etdc/view?acc_num=ysu1472463855 unrestricted This thesis or dissertation is protected by copyright: all rights reserved. It may not be copied or redistributed beyond the terms of applicable copyright laws.
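The abstract reports sensitivity, specificity, and MCC for individual locations. As a rough illustration of how such per-class values can be derived from a multi-class confusion matrix, the sketch below treats one class as positive and all others as negative. The function name, the matrix layout, and the toy numbers are assumptions made for this example; they are not taken from the thesis and do not reproduce its results.

```python
import math


def one_vs_rest_metrics(confusion, k):
    """Return (sensitivity, specificity, MCC) for class k.

    Assumed layout: confusion[i][j] is the number of samples whose true
    class is i and whose predicted class is j. All classes other than k
    are pooled into a single negative class.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    tp = confusion[k][k]
    fn = sum(confusion[k]) - tp                        # true k, predicted as something else
    fp = sum(confusion[i][k] for i in range(n)) - tp   # predicted k, truly something else
    tn = total - tp - fn - fp

    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Toy 3-class confusion matrix with made-up counts (hypothetical data).
toy = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 1, 9],
]

if __name__ == "__main__":
    for k in range(len(toy)):
        sens, spec, mcc = one_vs_rest_metrics(toy, k)
        print(f"class {k}: sensitivity={sens:.3f} specificity={spec:.3f} MCC={mcc:.3f}")
```

With a 9-location dataset such as the one described here, the same function would be called once per location on a 9 x 9 matrix; small classes can show high specificity even when sensitivity is low, which is why MCC is often reported alongside both.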
collection NDLTD
language English
sources NDLTD
topic Biology
Statistics
Protein
Subcellular location
Computational predictors
author Khavari, Sepideh
title Predicting Human and Animal Protein Subcellular Location
publisher Youngstown State University / OhioLINK
publishDate 2016
url http://rave.ohiolink.edu/etdc/view?acc_num=ysu1472463855