Improve Query Index Table of BLAST in GPU


Bibliographic Details
Main Authors: Liu, Yan-Fang, 劉妍芳
Other Authors: Wu, Chao-Chin
Format: Others
Language: zh-TW
Published: 2019
Online Access: http://ndltd.ncl.edu.tw/handle/q3uc9z
Description
Summary: Master's === National Changhua University of Education === Department of Computer Science and Information Engineering === 107 === To cope with the explosive, high-dimensional growth of biological information, biological databases and analytical tools are crucial to Bioinformatics research, and sequence alignment is a very important research field within Bioinformatics. To overcome the shortcomings of earlier algorithms, the National Center for Biotechnology Information (NCBI) provides a tool called BLAST for rapid searching in DNA and protein sequence alignment. In recent years, with the rapid advancement of GPU parallel processing with CUDA, some researchers have proposed using the GPU to improve the performance of BLAST. The resulting algorithm, named CUDA-BLASTP, mainly uses a Deterministic Finite-state Automaton (DFA) to store data, which improves the alignment process of BLAST's Seed Generation stage. This thesis analyzes the four stages of CUDA-BLASTP and finds that stage 1, which performs table-lookup comparison, accounts for most of the overall execution time. For this reason, we propose a new Query index table data structure that replaces the DFA method in order to improve the performance of CUDA-BLASTP. In the final tests, we used Queries with a length of less than 1,024 to compare against the CUDA-BLASTP algorithm, and Queries with a length of less than 128 to compare against the work of Chen Zhaoyou et al. The experiments show that, for BLAST sequence alignment, the index table we designed can be used when memory size is not a concern and the Query length is less than 1,024.
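
To make the idea of a Query index table concrete, the following is a minimal CPU-side sketch in Python. It is not the thesis's data structure or its GPU code, which this record does not describe. It assumes a 20-letter amino acid alphabet, a word length of 3, and exact word matches only (no neighborhood words or scoring matrix). The names `ALPHABET`, `W`, `word_key`, `build_query_index`, and `find_seeds` are hypothetical and chosen for illustration.

```python
# Illustrative sketch only: a direct-addressed query index table for seed lookup.
# Assumptions (not from the thesis): 20-letter alphabet, word length 3,
# exact word matches only.

ALPHABET = "ARNDCQEGHILKMFPSTWYV"  # assumed: the 20 standard amino acids
CODE = {aa: i for i, aa in enumerate(ALPHABET)}
W = 3  # assumed word length


def word_key(word):
    """Encode a W-letter word as an integer in base len(ALPHABET)."""
    key = 0
    for aa in word:
        key = key * len(ALPHABET) + CODE[aa]
    return key


def build_query_index(query):
    """Return a table with one slot per possible word; each slot lists
    the query positions where that word starts."""
    table = [[] for _ in range(len(ALPHABET) ** W)]
    for pos in range(len(query) - W + 1):
        table[word_key(query[pos:pos + W])].append(pos)
    return table


def find_seeds(table, subject):
    """Scan the subject sequence and collect (query_pos, subject_pos)
    pairs for every word that also occurs in the query."""
    seeds = []
    for s_pos in range(len(subject) - W + 1):
        for q_pos in table[word_key(subject[s_pos:s_pos + W])]:
            seeds.append((q_pos, s_pos))
    return seeds


if __name__ == "__main__":
    index = build_query_index("MKVLAAGMKV")
    print(find_seeds(index, "AAGMKVW"))
```

In this sketch each subject word costs one array access, at the price of a table with one slot for every possible word regardless of Query length. That is one way to picture the trade-off between lookup speed and memory size that the summary mentions.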