Signal processing for DNA sequencing
Thesis (M.Eng. and S.B.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. Includes bibliographical references (p. 83-86). DNA sequencing is the process of determining the sequence of chemical bases in a particular DNA molecule, nature's...
Main Author: | Boufounos, Petros T., 1977- |
---|---|
Other Authors: | Alan V. Oppenheim |
Format: | Others |
Language: | English |
Published: | Massachusetts Institute of Technology, 2005 |
Subjects: | Electrical Engineering and Computer Science. Nucleotide sequence |
Online Access: | http://hdl.handle.net/1721.1/17536 |
id |
ndltd-MIT-oai-dspace.mit.edu-1721.1-17536 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-MIT-oai-dspace.mit.edu-1721.1-17536 2019-05-02T16:19:48Z Signal processing for DNA sequencing Signal processing for Deoxyribonucleic acid sequencing Boufounos, Petros T., 1977- Alan V. Oppenheim. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. Nucleotide sequence Thesis (M.Eng. and S.B.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. Includes bibliographical references (p. 83-86). DNA sequencing is the process of determining the sequence of chemical bases in a particular DNA molecule, nature's blueprint of how life works. The advancement of biological science has created a vast demand for sequencing methods, which needs to be addressed by automated equipment. This thesis addresses one part of that process, known as base calling: the conversion of the electrical signal (the electropherogram) collected by the sequencing equipment into a sequence of letters drawn from {A, T, C, G} that corresponds to the sequence in the molecule being sequenced. This work formulates base calling as a pattern recognition problem and observes its striking resemblance to the speech recognition problem. We therefore propose combining Hidden Markov Models (HMMs) and Artificial Neural Networks to solve it. In the formulation, we derive an algorithm for training both models together. Furthermore, we devise a method to create very accurate training data that requires minimal hand-labeling. We compare our method with the de facto standard, PHRED, and obtain comparable results. Finally, we propose alternative HMM topologies that have the potential to significantly improve the performance of the method. by Petros T. Boufounos. M.Eng. and S.B. 2005-06-02T16:05:25Z 2005-06-02T16:05:25Z 2002 2002 Thesis http://hdl.handle.net/1721.1/17536 51111486 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 86 p. 3809782 bytes 3809587 bytes application/pdf application/pdf application/pdf Massachusetts Institute of Technology |
collection |
NDLTD |
language |
English |
format |
Others |
sources |
NDLTD |
topic |
Electrical Engineering and Computer Science. Nucleotide sequence |
spellingShingle |
Electrical Engineering and Computer Science. Nucleotide sequence Boufounos, Petros T., 1977- Signal processing for DNA sequencing |
description |
Thesis (M.Eng. and S.B.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2002. Includes bibliographical references (p. 83-86). DNA sequencing is the process of determining the sequence of chemical bases in a particular DNA molecule, nature's blueprint of how life works. The advancement of biological science has created a vast demand for sequencing methods, which needs to be addressed by automated equipment. This thesis addresses one part of that process, known as base calling: the conversion of the electrical signal (the electropherogram) collected by the sequencing equipment into a sequence of letters drawn from {A, T, C, G} that corresponds to the sequence in the molecule being sequenced. This work formulates base calling as a pattern recognition problem and observes its striking resemblance to the speech recognition problem. We therefore propose combining Hidden Markov Models (HMMs) and Artificial Neural Networks to solve it. In the formulation, we derive an algorithm for training both models together. Furthermore, we devise a method to create very accurate training data that requires minimal hand-labeling. We compare our method with the de facto standard, PHRED, and obtain comparable results. Finally, we propose alternative HMM topologies that have the potential to significantly improve the performance of the method. By Petros T. Boufounos. M.Eng. and S.B. |
author2 |
Alan V. Oppenheim. |
author_facet |
Alan V. Oppenheim. Boufounos, Petros T., 1977- |
author |
Boufounos, Petros T., 1977- |
author_sort |
Boufounos, Petros T., 1977- |
title |
Signal processing for DNA sequencing |
publisher |
Massachusetts Institute of Technology |
publishDate |
2005 |
url |
http://hdl.handle.net/1721.1/17536 |