Summary: | Master's === National Taiwan University === Graduate Institute of Networking and Multimedia === 101 === Mass spectrometry is one of the most versatile and widely used analytical methods today. In a study called Scaffold Hunter (2011), Yeu-Chern Harn et al. proposed a method that outputs candidates for each peak of a mass spectrum by exploiting the NPSDB, a database of 82,242 scaffolds, together with side-chain likeliness data. The core combinatorial problem of that study was proved to be NP-complete by Su-Bohan et al. (2012), who introduced a dynamic programming algorithm and an iterative dynamic programming algorithm that solve the problem in average pseudo-polynomial time and average polynomial time, respectively. Although the methods proposed by Yeu-Chern Harn and Su-Bohan seem very promising, they lack three major elements: first, a stricter mathematical framework to bridge Yeu-Chern Harn’s and Su-Bohan’s work; second, a framework for testing, evaluating, and tuning the prediction method; and third, efficient and predictable running times.
Therefore, in this study we first develop a mathematical framework to represent all the important concepts. We then introduce FFT, a Framework for Fast Tuning of the prediction method, and evaluate the total running time for processing all 82,242 scaffolds of Yeu-Chern Harn’s Natural Product Scaffold database using MapReduce on a home-made 5-node Hadoop cluster. Finally, we study the improvements that can be made to Su-Bohan’s dynamic programming algorithms using general-purpose GPU programming, and evaluate the running time of GAME, a CUDA-based GPU Accelerated Mixture Elucidator.
|