Transmembrane protein structure prediction using machine learning

This thesis describes the development and application of machine learning-based methods for the prediction of alpha-helical transmembrane protein structure from sequence alone. It is divided into six chapters. Chapter 1 provides an introduction to membrane structure and dynamics, membrane protein cl...

Bibliographic Details
Main Author: Nugent, T. C. O.
Published: University College London (University of London) 2010
Subjects:
570
Online Access: http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.565131
id ndltd-bl.uk-oai-ethos.bl.uk-565131
record_format oai_dc
spelling ndltd-bl.uk-oai-ethos.bl.uk-565131; 2015-12-03T03:29:38Z; Transmembrane protein structure prediction using machine learning; Nugent, T. C. O.; 2010; 570; University College London (University of London); http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.565131; http://discovery.ucl.ac.uk/792008/; Electronic Thesis or Dissertation
collection NDLTD
sources NDLTD
topic 570
description This thesis describes the development and application of machine learning-based methods for the prediction of alpha-helical transmembrane protein structure from sequence alone. It is divided into six chapters. Chapter 1 provides an introduction to membrane structure and dynamics, membrane protein classes and families, and membrane protein structure prediction. Chapter 2 describes a topological study of the transmembrane protein CLN3 using a consensus of bioinformatic approaches constrained by experimental data. Mutations in CLN3 can cause juvenile neuronal ceroid lipofuscinosis, or Batten disease, an inherited neurodegenerative lysosomal storage disease affecting children; such studies are therefore important for directing further experimental work into this incurable illness. Chapter 3 explores the possibility of using biologically meaningful signatures, described as regular expressions, to influence the assignment of inside and outside loop locations during transmembrane topology prediction. Using this approach, it was possible to modify a recent topology prediction method, leading to a 6% improvement in prediction accuracy on a standard data set. Chapter 4 describes the development of a novel support vector machine-based topology predictor that integrates both signal peptide and re-entrant helix prediction, benchmarked with full cross-validation on a novel data set of sequences with known crystal structures. The method achieves state-of-the-art performance in predicting topology and discriminating between globular and transmembrane proteins. We also present the results of applying these tools to a number of complete genomes. Chapter 5 describes a novel approach to predicting lipid exposure, residue contacts, helix-helix interactions and finally the optimal helical packing arrangement of transmembrane proteins.
It is based on two support vector machine classifiers that predict per-residue lipid exposure and residue contacts, which are used to determine helix-helix interaction with up to 65% accuracy. The method is also able to discriminate native from decoy helical packing arrangements with up to 70% accuracy. Finally, a force-directed algorithm is employed to construct the optimal helical packing arrangement, which demonstrates success for proteins containing up to 13 transmembrane helices. The final chapter summarises the major contributions of this thesis to biology and discusses future perspectives for TM protein structure prediction.
author Nugent, T. C. O.
title Transmembrane protein structure prediction using machine learning
publisher University College London (University of London)
publishDate 2010
url http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.565131