Applying Sequential Pattern Mining Technique to Construct a Pairwise Sequence Similarity Kernel for Support Vector Machine Classifiers
Master's === Yuan Ze University === Department of Industrial Engineering and Management === 103 === Sequence classification problems arise in many real-world applications, such as protein function prediction and text classification. Support vector machines (SVMs) have been used for sequence classification, since SVMs...
Main Authors: | Yu-Yu Yao, 姚佑俞 |
---|---|
Other Authors: | Chieh-Yuan Tsai |
Format: | Others |
Language: | en_US |
Online Access: | http://ndltd.ncl.edu.tw/handle/75795224962225592217 |
id |
ndltd-TW-103YZU05031046 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-103YZU05031046 2016-12-04T04:07:59Z http://ndltd.ncl.edu.tw/handle/75795224962225592217 Applying Sequential Pattern Mining Technique to Construct a Pairwise Sequence Similarity Kernel for Support Vector Machine Classifiers 應用序列樣式探勘技術建構成對序列相似核方法於支援向量機分類器 Yu-Yu Yao 姚佑俞 Master's, Yuan Ze University, Department of Industrial Engineering and Management, 103 Chieh-Yuan Tsai 蔡介元 thesis 83 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === Yuan Ze University === Department of Industrial Engineering and Management === 103 === Sequence classification problems arise in many real-world applications, such as protein function prediction and text classification. Support vector machines (SVMs) have been used for sequence classification because they can handle nonlinear data and classify efficiently. However, the most difficult part of using an SVM is designing an appropriate kernel function. This thesis therefore proposes a pairwise sequence similarity kernel that uses sequential patterns, rather than k-mers, as reference sequences and evaluates the similarity scores between the reference sequences and the sequence data with a map function. To obtain the sequential patterns, three sequential pattern mining methods are used to extract frequent sequential patterns, frequent closed sequential patterns, and frequent maximal sequential patterns from sequence databases. The three pattern types are then compared to determine which achieves higher accuracy. The map function used in the proposed kernel to calculate the similarity score is the edit distance algorithm. Next, a sequence SVM classifier is built on the proposed pairwise sequence similarity kernel, and this classifier is used to predict the class label of a new sequence. An artificial dataset and a real protein sequence dataset are used to test the proposed SVM classification model with each of the three types of sequential patterns. The experimental results indicate that the proposed SVM classification model using the pairwise sequence similarity kernel is efficient and feasible. |
author2 |
Chieh-Yuan Tsai |
author_facet |
Chieh-Yuan Tsai Yu-Yu Yao 姚佑俞 |
author |
Yu-Yu Yao 姚佑俞 |
title |
Applying Sequential Pattern Mining Technique to Construct a Pairwise Sequence Similarity Kernel for Support Vector Machine Classifiers |
url |
http://ndltd.ncl.edu.tw/handle/75795224962225592217 |
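
The abstract describes a kernel that scores each sequence against a set of reference patterns using edit distance. The following is a minimal sketch of that idea, not the thesis's actual implementation. It rests on several assumptions that do not come from the record: the similarity score is taken to be `1 - distance / longer length`, the kernel value is the dot product of the two similarity vectors, the reference patterns are hard-coded rather than mined, and all function names are hypothetical.

```python
from typing import List, Sequence


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,           # delete ca
                            curr[j - 1] + 1,       # insert cb
                            prev[j - 1] + cost))   # substitute or match
        prev = curr
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalisation: 1 - distance / longer length, giving a value in [0, 1]."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def feature_map(seq: str, references: Sequence[str]) -> List[float]:
    """Map a sequence to its vector of similarity scores against each reference."""
    return [similarity(seq, ref) for ref in references]


def gram_matrix(seqs: Sequence[str], references: Sequence[str]) -> List[List[float]]:
    """Kernel matrix: dot products of the similarity vectors (assumed kernel form)."""
    feats = [feature_map(s, references) for s in seqs]
    return [[sum(p * q for p, q in zip(fi, fj)) for fj in feats] for fi in feats]


# Hypothetical reference patterns; in the thesis these would come from
# sequential pattern mining (frequent, closed, or maximal patterns).
references = ["ACG", "GT"]
sequences = ["ACGT", "ACGA", "TTTT"]

for row in gram_matrix(sequences, references):
    print(row)
```

Because the kernel here is an explicit dot product of feature vectors, the resulting matrix is symmetric and positive semidefinite, so it could be passed to an SVM implementation that accepts a precomputed kernel matrix, for example scikit-learn's `SVC(kernel="precomputed")`.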