From CopeOpi Scores to CopeOpi Vectors: Word Vectors for Multiclass Text Classification

Master's === National Chiao Tung University === Institute of Computer Science and Engineering === 106 === In the era of technology, millions of digital texts are generated every day. To derive useful information from this textual data, text mining has become a popular area of both research and business. One of the most important tasks in text mining is text class...

Bibliographic Details
Main Authors: Tsai, Pei-Shan, 蔡佩珊
Other Authors: Chen, Ying-Ping
Format: Others
Language: en_US
Published: 2017
Online Access: http://ndltd.ncl.edu.tw/handle/8u36g3
id ndltd-TW-106NCTU5394016
record_format oai_dc
spelling ndltd-TW-106NCTU5394016 2019-05-16T00:08:11Z http://ndltd.ncl.edu.tw/handle/8u36g3 From CopeOpi Scores to CopeOpi Vectors: Word Vectors for Multiclass Text Classification 從CopeOpi純量擴充至CopeOpi向量:用於多類別本文分類的詞向量 Tsai, Pei-Shan 蔡佩珊 Master's National Chiao Tung University Institute of Computer Science and Engineering 106 Chen, Ying-Ping 陳穎平 2017 學位論文 ; thesis 43 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Chiao Tung University === Institute of Computer Science and Engineering === 106 === In the era of technology, millions of digital texts are generated every day. To derive useful information from this textual data, text mining has become a popular area of both research and business. One of the most important tasks in text mining is text classification. In this thesis, we propose CopeOpi vectors, a set of word vectors that form a vector space model for multiclass text classification. We extend CopeOpi scores, which are used in Chinese sentiment analysis, to CopeOpi vectors, which can be used for multiclass text classification without being restricted to a particular language. We verify the functionality of CopeOpi vectors on a series of text classification problems, including sentiment analysis and topic categorization, in both English and Chinese. We compare them with several commonly used features for text classification and examine these features with different types of machine learning algorithms. The results show that CopeOpi vectors produce comparable results with a smaller vector size and shorter training time, which makes them effective and efficient features for multiclass text classification.
author2 Chen, Ying-Ping
author Tsai, Pei-Shan
蔡佩珊
title From CopeOpi Scores to CopeOpi Vectors: Word Vectors for Multiclass Text Classification
publishDate 2017
url http://ndltd.ncl.edu.tw/handle/8u36g3