Multiple Sequence Alignment Using the Clustering Method

碩士 === 國立中山大學 === 資訊工程學系研究所 === 89 === The multiple sequence alignment (MSA) is a fundamental technique of molecular biology. Biological sequences are aligned with each other vertically in order to show the similarities and differences among them. Due to its importance, many algorithms have been pro...

Full description

Bibliographic Details
Main Authors: Kuen-Feng Huang, 黃崑峰
Other Authors: Chang-Biau Yang
Format: Others
Language:en_US
Published: 2001
Online Access:http://ndltd.ncl.edu.tw/handle/49117729014706086846
Description
Summary:碩士 === 國立中山大學 === 資訊工程學系研究所 === 89 === The multiple sequence alignment (MSA) is a fundamental technique of molecular biology. Biological sequences are aligned with each other vertically in order to show the similarities and differences among them. Due to its importance, many algorithms have been proposed. With dynamic programming, finding the optimal alignment for a pair of sequences can be done in O(n2) time, where n is the length of the two strings. Unfortunately, for the general optimization problem of aligning k sequences of length n , O(nk) time is required. In this thesis, we shall first propose an efficient group alignment method to perform the alignment between two groups of sequences. Then we shall propose a clustering method to build the tree topology for merging. The clustering method is based on the concept that the two sequences having the longest distance should be split into two clusters. By our experiments, both the alignment quality and required time of our algorithm are better than those of NJ (neighbor joining) algorithm and Clustal W algorithm.