Summary: | Master's === National Formosa University === Graduate Institute of Computer Science and Information Engineering === 103 === Palindromes are strings of symbols that read the same forward and backward, and they have biological consequences, especially in sequence analysis. Palindromes are frequent and widespread in human cancers, and identifying them could help advance the understanding of genomic instability. Palindrome detection is therefore an important problem in computational biology.
In this thesis, we propose an effective dynamic programming algorithm that finds all approximate palindromes with up to K errors, where K is specified by the user. We applied the algorithm to extract approximate palindromes from RNA sequences: 257 fusion gene sequences from NCBI, 1881 human microRNAs from miRBase, and 154 long non-coding RNAs (lncRNAs). The analyses include the longest palindromes, the longest palindromes under the error bound, the ratio of palindrome length to sequence length, and the corresponding ratios of the four bases. A web platform displays the results and provides an online service for identifying palindromes; it can be accessed at http://bioinfo.csie.nfu.edu.tw/palindrome/index.html .
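The abstract does not spell out the recurrence, so the following is only a minimal sketch of how a dynamic programming search for approximate palindromes might look, not the thesis's actual algorithm. It assumes that an "error" is a single mismatched pair of symmetric positions (substitutions only, no insertions or deletions) and that a palindrome is a string equal to its own reverse, as defined above. The function names, the `min_len` parameter, and the example sequence are hypothetical.

```python
def approximate_palindromes(seq, k, min_len=4):
    """Return (start, end, mismatches) for every substring seq[start:end]
    of length >= min_len that reads the same in both directions after
    changing at most k symbols (assumption: errors are mismatches only)."""
    n = len(seq)
    # mis[i][j] = number of mismatched symmetric pairs in seq[i..j], inclusive.
    # Substrings of length 0 or 1 have no pairs, so the table starts at 0.
    mis = [[0] * n for _ in range(n)]
    hits = []
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            # Recurrence: reuse the count of the inner substring seq[i+1..j-1]
            # and add one if the two outer symbols disagree.
            inner = mis[i + 1][j - 1] if length > 2 else 0
            mis[i][j] = inner + (seq[i] != seq[j])
            if length >= min_len and mis[i][j] <= k:
                hits.append((i, j + 1, mis[i][j]))
    return hits


def longest_approximate_palindrome(seq, k):
    """Return the longest hit with at most k mismatches, or None if there is none."""
    hits = approximate_palindromes(seq, k, min_len=2)
    return max(hits, key=lambda h: h[1] - h[0], default=None)


if __name__ == "__main__":
    rna = "ACGUUGGA"  # hypothetical example sequence
    print(longest_approximate_palindrome(rna, 0))
    print(longest_approximate_palindrome(rna, 1))
```

The table costs O(n^2) time and memory, which is manageable for short sequences such as microRNAs; longer inputs would call for a more economical formulation. Raising `k` from 0 to 1 in the example lets the whole sequence qualify, because only the second and seventh symbols disagree.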
|