From CopeOpi Scores to CopeOpi Vectors: Word Vectors for Multiclass Text Classification
Master's thesis === National Chiao Tung University === Institute of Computer Science and Engineering === 106 === In the era of technology, millions of digital texts are generated every day. Text mining, which derives useful information from such textual data, has become a popular area in both research and business. One of the most important tasks in text mining is text class...
Main Authors: | Tsai, Pei-Shan 蔡佩珊 |
---|---|
Other Authors: | Chen, Ying-Ping |
Format: | Others |
Language: | en_US |
Published: | 2017 |
Online Access: | http://ndltd.ncl.edu.tw/handle/8u36g3 |
id |
ndltd-TW-106NCTU5394016 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-106NCTU5394016 2019-05-16T00:08:11Z http://ndltd.ncl.edu.tw/handle/8u36g3 From CopeOpi Scores to CopeOpi Vectors: Word Vectors for Multiclass Text Classification 從CopeOpi純量擴充至CopeOpi向量:用於多類別本文分類的詞向量 Tsai, Pei-Shan 蔡佩珊 Master's thesis National Chiao Tung University Institute of Computer Science and Engineering 106 In the era of technology, millions of digital texts are generated every day. Text mining, which derives useful information from such textual data, has become a popular area in both research and business. One of the most important tasks in text mining is text classification. In this thesis, we propose a vector space model for multiclass text classification based on word vectors called CopeOpi vectors. We extend CopeOpi scores, which are used in Chinese sentiment analysis, to CopeOpi vectors, which can be used in multiclass text classification without being limited to a particular language. We verify the effectiveness of CopeOpi vectors on a series of text classification problems, including sentiment analysis and topic categorization, in both English and Chinese. We compare them with several commonly used features for text classification and examine these features with different types of machine learning algorithms. The results show that CopeOpi vectors can produce comparable results with a smaller vector size and shorter training time. CopeOpi vectors are effective and efficient features for multiclass text classification. Chen, Ying-Ping 陳穎平 2017 thesis 43 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's thesis === National Chiao Tung University === Institute of Computer Science and Engineering === 106 === In the era of technology, millions of digital texts are generated every day. Text mining, which derives useful information from such textual data, has become a popular area in both research and business. One of the most important tasks in text mining is text classification.
In this thesis, we propose a vector space model for multiclass text classification based on word vectors called CopeOpi vectors. We extend CopeOpi scores, which are used in Chinese sentiment analysis, to CopeOpi vectors, which can be used in multiclass text classification without being limited to a particular language.
We verify the effectiveness of CopeOpi vectors on a series of text classification problems, including sentiment analysis and topic categorization, in both English and Chinese. We compare them with several commonly used features for text classification and examine these features with different types of machine learning algorithms. The results show that CopeOpi vectors can produce comparable results with a smaller vector size and shorter training time. CopeOpi vectors are effective and efficient features for multiclass text classification.
|
author2 |
Chen, Ying-Ping |
author_facet |
Chen, Ying-Ping Tsai, Pei-Shan 蔡佩珊 |
author |
Tsai, Pei-Shan 蔡佩珊 |
spellingShingle |
Tsai, Pei-Shan 蔡佩珊 From CopeOpi Scores to CopeOpi Vectors: Word Vectors for Multiclass Text Classification |
author_sort |
Tsai, Pei-Shan |
title |
From CopeOpi Scores to CopeOpi Vectors: Word Vectors for Multiclass Text Classification |
publishDate |
2017 |
url |
http://ndltd.ncl.edu.tw/handle/8u36g3 |