Probabilistic Graphical Models and Algorithms for

In this thesis I present research in two fields: machine learning and computational biology. First, I develop new machine learning methods for graphical models that can be applied to protein problems. Then I apply graphical model algorithms to protein problems, obtaining improvements in protein st...

Bibliographic Details
Main Author:	Jiao, Feng
Language:	en
Published:	2008
Subjects:	machine learning; computational biology; Computer Science
Online Access:	http://hdl.handle.net/10012/3773

id	ndltd-WATERLOO-oai-uwspace.uwaterloo.ca-10012-3773
record_format	oai_dc
spelling	ndltd-WATERLOO-oai-uwspace.uwaterloo.ca-10012-3773; 2013-01-08T18:51:16Z; Jiao, Feng; 2008-05-26T16:24:08Z; 2008; http://hdl.handle.net/10012/3773; en; machine learning; computational biology; Probabilistic Graphical Models and Algorithms for; Thesis or Dissertation; School of Computer Science; Doctor of Philosophy; Computer Science
collection	NDLTD
language	en
sources	NDLTD
topic	machine learning; computational biology; Computer Science
spellingShingle	machine learning computational biology Computer Science Jiao, Feng Probabilistic Graphical Models and Algorithms for
description	In this thesis I present research in two fields: machine learning and computational biology. First, I develop new machine learning methods for graphical models that can be applied to protein problems. Then I apply graphical model algorithms to protein problems, obtaining improvements in protein structure prediction and protein structure alignment. First, in the machine learning work, I focus on a special kind of graphical model, conditional random fields (CRFs). Here, I present a new semi-supervised training procedure for CRFs that can be used to train sequence segmentors and labelers from a combination of labeled and unlabeled training data. Such learning algorithms can be applied to protein and gene named-entity recognition problems. This work provides one of the first semi-supervised discriminative training methods for structured classification. Second, in my computational biology work, I focus mainly on protein problems. In particular, I first propose a tree decomposition method for solving the protein structure prediction and protein structure alignment problems. In so doing, I reveal why tree decomposition is a good method for many protein problems. Then, I propose a computational framework for detecting structures similar to a target protein from sparse NMR data, which can help to predict protein structure using experimental data. Finally, I propose a new machine learning approach, LS_Boost, to solve the protein fold recognition problem, which is one of the key steps in protein structure prediction. After a thorough comparison, the algorithm is shown to be both more accurate and more efficient than the traditional z-Score method and other machine learning methods.
author	Jiao, Feng
author_facet	Jiao, Feng
author_sort	Jiao, Feng
title	Probabilistic Graphical Models and Algorithms for
title_short	Probabilistic Graphical Models and Algorithms for
title_full	Probabilistic Graphical Models and Algorithms for
title_fullStr	Probabilistic Graphical Models and Algorithms for
title_full_unstemmed	Probabilistic Graphical Models and Algorithms for
title_sort	probabilistic graphical models and algorithms for
publishDate	2008
url	http://hdl.handle.net/10012/3773
work_keys_str_mv	AT jiaofeng probabilisticgraphicalmodelsandalgorithmsfor
_version_	1716573138631786496

Similar Items