Signal processing for DNA sequencing

Thesis (M.Eng. and S.B.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. === Includes bibliographical references (p. 83-86). === DNA sequencing is the process of determining the sequence of chemical bases in a particular DNA molecule, nature's...

Bibliographic Details
Main Author: Boufounos, Petros T., 1977-
Other Authors: Alan V. Oppenheim.
Format: Others
Language: English
Published: Massachusetts Institute of Technology 2005
Subjects: Electrical Engineering and Computer Science; Nucleotide sequence
Online Access: http://hdl.handle.net/1721.1/17536
id ndltd-MIT-oai-dspace.mit.edu-1721.1-17536
record_format oai_dc
spelling ndltd-MIT-oai-dspace.mit.edu-1721.1-17536 2019-05-02T16:19:48Z Signal processing for DNA sequencing Signal processing for Deoxyribonucleic acid sequencing Boufounos, Petros T., 1977- Alan V. Oppenheim. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. Nucleotide sequence Thesis (M.Eng. and S.B.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. Includes bibliographical references (p. 83-86). DNA sequencing is the process of determining the sequence of chemical bases in a particular DNA molecule, nature's blueprint of how life works. The advancement of biological science has created a vast demand for sequencing methods, which needs to be addressed by automated equipment. This thesis addresses one part of that process, known as base calling: the conversion of the electrical signal (the electropherogram) collected by the sequencing equipment into a sequence of letters drawn from {A, T, C, G} that corresponds to the sequence in the molecule sequenced. This work formulates the problem as a pattern recognition problem and observes its striking resemblance to the speech recognition problem. We therefore propose combining Hidden Markov Models and Artificial Neural Networks to solve it. In the formulation we derive an algorithm for training both models together. Furthermore, we devise a method to create very accurate training data, requiring minimal hand-labeling. We compare our method with the de facto standard, PHRED, and produce comparable results. Finally, we propose alternative HMM topologies that have the potential to significantly improve the performance of the method. by Petros T. Boufounos. M.Eng. and S.B. 2005-06-02T16:05:25Z 2005-06-02T16:05:25Z 2002 2002 Thesis http://hdl.handle.net/1721.1/17536 51111486 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 86 p. 3809782 bytes 3809587 bytes application/pdf application/pdf application/pdf Massachusetts Institute of Technology
collection NDLTD
language English
format Others
sources NDLTD
topic Electrical Engineering and Computer Science.
Nucleotide sequence
description Thesis (M.Eng. and S.B.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. === Includes bibliographical references (p. 83-86). === DNA sequencing is the process of determining the sequence of chemical bases in a particular DNA molecule, nature's blueprint of how life works. The advancement of biological science has created a vast demand for sequencing methods, which needs to be addressed by automated equipment. This thesis addresses one part of that process, known as base calling: the conversion of the electrical signal (the electropherogram) collected by the sequencing equipment into a sequence of letters drawn from {A, T, C, G} that corresponds to the sequence in the molecule sequenced. This work formulates the problem as a pattern recognition problem and observes its striking resemblance to the speech recognition problem. We therefore propose combining Hidden Markov Models and Artificial Neural Networks to solve it. In the formulation we derive an algorithm for training both models together. Furthermore, we devise a method to create very accurate training data, requiring minimal hand-labeling. We compare our method with the de facto standard, PHRED, and produce comparable results. Finally, we propose alternative HMM topologies that have the potential to significantly improve the performance of the method. === by Petros T. Boufounos. === M.Eng. and S.B.
author2 Alan V. Oppenheim.
author Boufounos, Petros T., 1977-
title Signal processing for DNA sequencing
publisher Massachusetts Institute of Technology
publishDate 2005
url http://hdl.handle.net/1721.1/17536