Jintishi Processing and Categorization

Bibliographic Details
Main Authors: Wang, Sheng-Chuan, 王笙權
Other Authors: Tyne, Liang
Format: Others
Language: zh-TW
Published: 2012
Online Access: http://ndltd.ncl.edu.tw/handle/97832173209005727797
Description
Summary: Master's thesis === National Chiao Tung University === Institute of Computer Science and Engineering === 100 === Jintishi is one of the classics of Chinese literature. It conveys rich emotion and thought in few words. Jintishi poems may contain allusions and follow syntactic and semantic parallelism, which makes them difficult to understand. Therefore, we used text classification techniques to analyze Jintishi and built a Jintishi topic identification system. The system provides poem search and poem analysis, including word segmentation, semantic tagging, topic identification, and emotion identification. We classified Jintishi into six topic categories: Chanting Object, Landscape, Desperate Wife, Farewell, Frontier, and Social Poem. Additionally, our system supports emotion categorization into happiness, sadness, or anger. We used 992 seven-character Lushi in the topic identification labeling experiment. We extracted eight lexical and concept features from Jintishi and used a support vector machine to identify the topic of each poem. We obtained 69.12% accuracy under ten-fold cross-validation. The emotion identification method was also implemented and tested: using 492 seven-character Lushi as the test corpus, we obtained 70.7% accuracy.
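
The ten-fold evaluation mentioned in the summary can be illustrated with a minimal Python sketch. This is not the thesis's code: the function names (`k_fold_indices`, `cross_val_accuracy`, `train_majority`) and the toy labels are hypothetical, and a trivial majority-label classifier stands in for the support vector machine, since the record does not describe the eight features or the SVM settings. The sketch only shows how a corpus is split into ten folds and how held-out accuracy is averaged.

```python
import random
from collections import Counter


def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_accuracy(samples, labels, train_fn, k=10, seed=0):
    """Mean held-out accuracy over k folds.

    train_fn(train_samples, train_labels) must return a function
    predict(sample) -> label.
    """
    scores = []
    for held_out in k_fold_indices(len(samples), k, seed):
        held = set(held_out)
        train_x = [s for i, s in enumerate(samples) if i not in held]
        train_y = [y for i, y in enumerate(labels) if i not in held]
        predict = train_fn(train_x, train_y)
        correct = sum(predict(samples[i]) == labels[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)


def train_majority(train_x, train_y):
    """Stand-in classifier (NOT an SVM): always predicts the most
    frequent label seen in the training folds."""
    most_common = Counter(train_y).most_common(1)[0][0]
    return lambda sample: most_common


# Toy data, made up for illustration: 20 placeholder "poems" with two
# of the six topic labels named in the summary.
toy_samples = [f"poem-{i}" for i in range(20)]
toy_labels = ["Landscape"] * 15 + ["Farewell"] * 5

toy_accuracy = cross_val_accuracy(toy_samples, toy_labels, train_majority)
print(f"ten-fold accuracy on toy data: {toy_accuracy:.2%}")
```

Because `train_fn` is passed in as a parameter, a real classifier trained on extracted features could replace `train_majority` without changing the evaluation loop.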