Machine-learning methods for structure prediction of β-barrel membrane proteins

Bibliographic Details
Main Author: Savojardo, Castrense <1981>
Other Authors: Fariselli, Piero
Format: Doctoral Thesis
Language: en
Published: Alma Mater Studiorum - Università di Bologna 2013
Subjects: INF/01 Informatica
Online Access: http://amsdottorato.unibo.it/5429/
id ndltd-unibo.it-oai-amsdottorato.cib.unibo.it-5429
record_format oai_dc
spelling ndltd-unibo.it-oai-amsdottorato.cib.unibo.it-5429 2014-03-24T16:30:25Z Machine-learning methods for structure prediction of β-barrel membrane proteins Savojardo, Castrense <1981> INF/01 Informatica Alma Mater Studiorum - Università di Bologna Fariselli, Piero 2013-04-08 Doctoral Thesis PeerReviewed application/pdf en http://amsdottorato.unibo.it/5429/ info:eu-repo/semantics/openAccess
collection NDLTD
language en
format Doctoral Thesis
sources NDLTD
topic INF/01 Informatica
description Proteins perform diverse functions that are essential for living organisms. Transmembrane proteins are an important class: they are inserted into biological membranes and carry out key cellular functions such as cell communication and active transport across the membrane. Transmembrane β-barrels (TMBBs) are a sub-class of membrane proteins that is largely under-represented in structure databases because their structures are extremely difficult to determine experimentally. Computational tools that can predict the structure of TMBBs are therefore needed. This thesis addresses two computational problems related to TMBBs: detecting TMBBs in large protein datasets and predicting the topology of TMBB proteins. First, a method for TMBB detection is presented, based on a novel neural network framework for variable-length sequence classification. The approach was validated on a non-redundant dataset of proteins and was also applied to genome-wide detection on the entire Escherichia coli proteome. In both experiments, the method significantly outperformed existing state-of-the-art approaches, reaching a PPV of 92% and an MCC of 0.82. Second, a method for TMBB topology prediction is introduced, based on grammatical modelling and probabilistic discriminative models for sequence data labeling. The method was evaluated on a newly generated dataset of 38 TMBB proteins obtained from high-resolution data in the PDB, and it correctly predicted the topology of 25 of the 38 protein chains. When tested on previously released datasets, the performance of the proposed approach was comparable or superior to the current state of the art in TMBB topology prediction.
author2 Fariselli, Piero
author_facet Fariselli, Piero
Savojardo, Castrense <1981>
author Savojardo, Castrense <1981>
author_sort Savojardo, Castrense <1981>
title Machine-learning methods for structure prediction of β-barrel membrane proteins
publisher Alma Mater Studiorum - Università di Bologna
publishDate 2013
url http://amsdottorato.unibo.it/5429/
work_keys_str_mv AT savojardocastrense1981 machinelearningmethodsforstructurepredictionofbbarrelmembraneproteins
_version_ 1716654602281025536
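
Note on the metrics quoted in the description: PPV (positive predictive value) and MCC (Matthews correlation coefficient) are standard binary-classification scores computed from a confusion matrix. The sketch below shows the usual definitions only. The confusion-matrix counts are hypothetical and chosen for illustration; they are not taken from the thesis, and the function names are our own. Only the 25-of-38 topology figure comes from the record.

```python
import math


def ppv(tp: int, fp: int) -> float:
    """Positive predictive value (precision): TP / (TP + FP)."""
    predicted_positive = tp + fp
    return tp / predicted_positive if predicted_positive else 0.0


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient, ranging from -1 to +1."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal count is zero.
    return numerator / denominator if denominator else 0.0


# Hypothetical counts for a TMBB detector (NOT from the thesis).
tp, fp, tn, fn = 46, 4, 940, 10
print(f"PPV = {ppv(tp, fp):.2f}")
print(f"MCC = {mcc(tp, fp, tn, fn):.2f}")

# Fraction of correctly predicted topologies reported in the description.
topology_accuracy = 25 / 38
print(f"Topology accuracy = {topology_accuracy:.1%}")
```

MCC is often reported alongside PPV for detection tasks of this kind because it accounts for all four confusion-matrix cells, which matters when positives are rare relative to the rest of a proteome.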