Probabilistic Graphical Models and Algorithms for
In this thesis I present research in two fields: machine learning and computational biology. First, I develop new machine learning methods for graphical models that can be applied to protein problems. Then I apply graphical model algorithms to protein problems, obtaining improvements in protein st...
Main Author: | Jiao, Feng |
---|---|
Language: | en |
Published: | 2008 |
Subjects: | |
Online Access: | http://hdl.handle.net/10012/3773 |
id |
ndltd-WATERLOO-oai-uwspace.uwaterloo.ca-10012-3773 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-WATERLOO-oai-uwspace.uwaterloo.ca-10012-3773; 2013-01-08T18:51:16Z; Jiao, Feng; 2008-05-26T16:24:08Z; 2008-05-26T16:24:08Z; 2008-05-26T16:24:08Z; 2008; http://hdl.handle.net/10012/3773; In this thesis I present research in two fields: machine learning and computational biology. First, I develop new machine learning methods for graphical models that can be applied to protein problems. Then I apply graphical model algorithms to protein problems, obtaining improvements in protein structure prediction and protein structure alignment. First, in the machine learning work, I focus on a special kind of graphical model: conditional random fields (CRFs). Here, I present a new semi-supervised training procedure for CRFs that can be used to train sequence segmentors and labelers from a combination of labeled and unlabeled training data. Such learning algorithms can be applied to protein and gene named entity recognition problems. This work provides one of the first semi-supervised discriminative training methods for structured classification. Second, in my computational biology work, I focus mainly on protein problems. In particular, I first propose a tree decomposition method for solving the protein structure prediction and protein structure alignment problems. In so doing, I show why tree decomposition is a good method for many protein problems. Then, I propose a computational framework for detecting structures similar to a target protein with sparse NMR data, which can help to predict protein structure using experimental data. Finally, I propose a new machine learning approach, LS_Boost, to solve the protein fold recognition problem, which is one of the key steps in protein structure prediction.
After a thorough comparison, the algorithm is shown to be both more accurate and more efficient than the traditional z-score method and other machine learning methods.; en; machine learning; computational biology; Probabilistic Graphical Models and Algorithms for; Thesis or Dissertation; School of Computer Science; Doctor of Philosophy; Computer Science |
collection |
NDLTD |
language |
en |
sources |
NDLTD |
topic |
machine learning; computational biology; Computer Science |
spellingShingle |
machine learning; computational biology; Computer Science; Jiao, Feng; Probabilistic Graphical Models and Algorithms for |
description |
In this thesis I present research in two fields: machine learning and computational biology.
First, I develop new machine learning methods for graphical models that can be applied to protein problems. Then I apply graphical model algorithms to protein problems, obtaining improvements in protein structure prediction and protein structure alignment. First, in the machine learning work, I focus on a special kind of graphical model: conditional random fields (CRFs). Here, I present a new semi-supervised training procedure for CRFs that can be used to train sequence segmentors and labelers from a combination of labeled and unlabeled training data. Such learning algorithms can be applied to protein and gene named entity recognition problems. This work provides one of the first semi-supervised discriminative training methods for structured classification.
Second, in my computational biology work, I focus mainly on protein problems. In particular, I first propose a tree decomposition method for solving the protein structure prediction and protein structure alignment problems. In so doing, I show why tree decomposition is a good method for many protein problems. Then, I propose a computational framework for detecting structures similar to a target protein with sparse NMR data, which can help to predict protein structure using experimental data.
Finally, I propose a new machine learning approach, LS_Boost, to solve the protein fold recognition problem, which is one of the key steps in protein structure prediction. After a thorough comparison, the algorithm is shown to be both more accurate and more efficient than the traditional z-score method and other machine learning methods. |
author |
Jiao, Feng |
author_facet |
Jiao, Feng |
author_sort |
Jiao, Feng |
title |
Probabilistic Graphical Models and Algorithms for |
title_short |
Probabilistic Graphical Models and Algorithms for |
title_full |
Probabilistic Graphical Models and Algorithms for |
title_fullStr |
Probabilistic Graphical Models and Algorithms for |
title_full_unstemmed |
Probabilistic Graphical Models and Algorithms for |
title_sort |
probabilistic graphical models and algorithms for |
publishDate |
2008 |
url |
http://hdl.handle.net/10012/3773 |
work_keys_str_mv |
AT jiaofeng probabilisticgraphicalmodelsandalgorithmsfor |
_version_ |
1716573138631786496 |