Genome Graphs
Whole-genome shotgun sequencing is an experimental technique used for obtaining information about a genome’s sequence, whereby it is broken up into many short (possibly overlapping) segments whose sequence is then determined. A long-standing use of sequencing is in genome assembly – the problem of d...
Main Author: | Medvedev, Paul |
---|---|
Other Authors: | Brudno, Michael |
Language: | en_ca |
Published: | 2010 |
Subjects: | bioinformatics |
Online Access: | http://hdl.handle.net/1807/26297 |
id |
ndltd-TORONTO-oai-tspace.library.utoronto.ca-1807-26297 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TORONTO-oai-tspace.library.utoronto.ca-1807-26297; 2013-04-19T19:55:10Z; Genome Graphs; Medvedev, Paul; bioinformatics; 0984; Brudno, Michael; Borodin, Allan; 2010-11; 2011-02-18T16:12:35Z; NO_RESTRICTION; Thesis; http://hdl.handle.net/1807/26297; en_ca |
collection |
NDLTD |
language |
en_ca |
sources |
NDLTD |
topic |
bioinformatics 0984 |
description |
Whole-genome shotgun sequencing is an experimental technique for obtaining information about a genome’s sequence, whereby the genome is broken up into many short (possibly overlapping) segments whose sequences are then determined. A long-standing use of sequencing is in genome assembly – the problem of determining the sequence of an unknown genome, which plays a central role in the sequencing of novel species. However, even within the same species, the genomes of two individuals differ, and though these variations are relatively small, they account for the observed variation in phenotypes. A large portion of these are copy number variants (CNVs), genomic segments that appear a different number of times in different individuals.
The unifying theme of this thesis is the use of genome graphs for both CNV detection and genome assembly problems. Genome graphs, which have already been successfully used for alignment and assembly, capture the structure of a genome even when its sequence is not fully known, as is the case with sequencing data. In this thesis, we extend their uses in several ways, culminating in a method for CNV detection that is based on a novel genome graph model. First, we demonstrate how the double-stranded nature of DNA can be efficiently incorporated into genome graphs by using the technique of bidirected network flow. Furthermore, we show how genome graphs can be efficiently used for finding solutions that maximize the likelihood of the data, as opposed to the usual maximum parsimony approach. Finally, we show how genome graphs can be useful for CNV detection through a novel construction called the donor graph. These extensions are combined into a method for detecting CNVs, which we use on a Yoruban human individual, showing a high degree of accuracy and improvement over previous methods. |
author2 |
Brudno, Michael |
author_facet |
Brudno, Michael Medvedev, Paul |
author |
Medvedev, Paul |
title |
Genome Graphs |
publishDate |
2010 |
url |
http://hdl.handle.net/1807/26297 |