Machine Learning for Variant Detection and Population Analysis in Heterogenerous Cancer Sample

Cancer is a complex and deadly disease caused by genetic lesions in somatic cells. Further research into computational methods for detecting and characterizing somatic mutations is necessary to understand, at a comprehensive systems level, the roles of those lesions in cancer development.

Bibliographic Details
Main Author: Jiao, Wei
Other Authors: Stein, Lincoln
Language: en_ca
Published: 2013
Subjects: Single nucleotide variant; Machine learning; Cancer heterogeneity; 0715
Online Access: http://hdl.handle.net/1807/42971
id ndltd-TORONTO-oai-tspace.library.utoronto.ca-1807-42971
record_format oai_dc
spelling ndltd-TORONTO-oai-tspace.library.utoronto.ca-1807-42971 | 2013-11-29T03:59:43Z | Machine Learning for Variant Detection and Population Analysis in Heterogenerous Cancer Sample | Jiao, Wei | Single nucleotide variant | Machine learning | Cancer heterogeneity | 0715 | Cancer is a complex and deadly disease caused by genetic lesions in somatic cells. Further research into computational methods for detecting and characterizing somatic mutations is necessary to understand, at a comprehensive systems level, the roles of those lesions in cancer development. In the first project, I trained a set of supervised machine learning classifiers to distinguish false positive from true positive somatic single nucleotide variants (SNVs). I was able to show an improvement in somatic SNV detection on the data set over the reported classifier. In the second project, we developed the PhyloSub model, which uses a nonparametric Bayesian prior over a set of trees to cluster SNVs and to infer the subclonal phylogenetic structure of tumors, with uncertainty, from SNV sequencing data. Experiments showed that the PhyloSub model could infer the subclonal phylogenetic structure from both single and multiple tumor samples. | Stein, Lincoln | Morris, Quaid | 2013-11 | 2013-11-28T19:48:11Z | NO_RESTRICTION | 2013-11-28T19:48:11Z | 2013-11-28 | Thesis | http://hdl.handle.net/1807/42971 | en_ca
collection NDLTD
language en_ca
sources NDLTD
topic Single nucleotide variant
Machine learning
Cancer heterogeneity
0715
description Cancer is a complex and deadly disease caused by genetic lesions in somatic cells. Further research into computational methods for detecting and characterizing somatic mutations is necessary to understand, at a comprehensive systems level, the roles of those lesions in cancer development. In the first project, I trained a set of supervised machine learning classifiers to distinguish false positive from true positive somatic single nucleotide variants (SNVs). I was able to show an improvement in somatic SNV detection on the data set over the reported classifier. In the second project, we developed the PhyloSub model, which uses a nonparametric Bayesian prior over a set of trees to cluster SNVs and to infer the subclonal phylogenetic structure of tumors, with uncertainty, from SNV sequencing data. Experiments showed that the PhyloSub model could infer the subclonal phylogenetic structure from both single and multiple tumor samples.
author2 Stein, Lincoln
author_facet Stein, Lincoln
Jiao, Wei
author Jiao, Wei
author_sort Jiao, Wei
title Machine Learning for Variant Detection and Population Analysis in Heterogenerous Cancer Sample
publishDate 2013
url http://hdl.handle.net/1807/42971