Summary: | Master's === Tatung University === Department of Applied Mathematics === 92 === DNA molecules have been proven to be the genetic material, and their properties are
determined by the order of four kinds of bases: A, C, G, and T. Hence DNA
sequencing has become one of the important topics in computational molecular biology.
In DNA sequencing, the occurrence of repeats complicates reconstruction and may
prevent the target sequence from being reconstructed uniquely. Moreover, the probability
of DNA sequencing depends on the patterns of DNA repeats. In this thesis, we study the
relationship between the patterns of DNA repeats and the probability of DNA sequencing.
After sequencing by hybridization, we obtain the spectrum: a simple set of all
fixed-length subsequences of the target DNA. Based on the spectrum, we construct a
reduced digraph in which each vertex represents a distinct repeat. Each Euler circuit
in the reduced digraph may then yield a possible reconstruction. Hence the probability of
DNA sequencing can be obtained by counting the Euler circuits. On the
other hand, we introduce pattern graphs, which make the patterns of DNA repeats
easy to present. Using combinatorial concepts, we characterize the patterns of DNA
repeats that have k possible reconstructions for some specific values of k. Moreover,
we enumerate the patterns of DNA repeats that have k possible sequencings and find the
corresponding generating functions. Finally, we carry out some extended studies and
present related results for specific repetitive patterns.
|
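The link between repeats and the number of possible reconstructions can be illustrated with a small sketch. It does not reproduce the thesis's reduced digraph or pattern graphs, whose construction is not given in this summary. Instead it swaps in the standard de Bruijn-style graph, in which every fixed-length subsequence is an edge between its prefix and its suffix, and enumerates Eulerian paths by brute force. The function names `spectrum` and `reconstructions`, the use of a multiset rather than a simple set, and the assumption that the first fragment of the target is known are all illustrative choices, not taken from the source.

```python
from collections import Counter


def spectrum(seq, l):
    """Multiset of all length-l substrings of seq (assumption: multiplicities are known)."""
    return Counter(seq[i:i + l] for i in range(len(seq) - l + 1))


def reconstructions(seq, l):
    """All strings that share seq's l-mer multiset and start with the same (l-1)-mer.

    Each l-mer is treated as an edge from its prefix to its suffix, so every
    Eulerian path that uses all edges spells one candidate reconstruction.
    Brute-force backtracking: only suitable for short examples.
    """
    edges = spectrum(seq, l)
    total = sum(edges.values())
    results = set()

    def walk(node, text, used):
        if used == total:
            results.add(text)
            return
        for kmer in list(edges):
            if edges[kmer] > 0 and kmer[:-1] == node:
                edges[kmer] -= 1          # take this edge
                walk(kmer[1:], text + kmer[-1], used + 1)
                edges[kmer] += 1          # backtrack

    start = seq[:l - 1]                   # assumption: the starting fragment is known
    walk(start, start, 0)
    return results


# C occurs three times, so the two loops C-G-C and C-T-C can be swapped.
print(sorted(reconstructions("ACGCTC", 2)))     # -> ['ACGCTC', 'ACTCGC']

# TGC occurs twice, but this repeat pattern still allows only one reconstruction.
print(sorted(reconstructions("ATGCGTGCA", 3)))  # -> ['ATGCGTGCA']
```

If every candidate is taken to be equally likely (again an assumption of this sketch), the chance of recovering the target is one over the number of candidates: 1/2 in the first example and 1 in the second. This shows how the outcome depends on the pattern of repeats rather than on their mere presence.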