Bi-perceptron for Chinese Web News Categorization
Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 104 === Mobile news, because of its inherently high frequency, has become a popular area pursued by many commercial companies in China. News categorization is an important technology in automatic news processing. Many supervised learning methods can be applied in t...
Main Authors: | Jian Pan 潘健 |
---|---|
Other Authors: | Cheng-Yuan Liou 劉長遠 |
Format: | Others |
Language: | en_US |
Published: | 2016 |
Online Access: | http://ndltd.ncl.edu.tw/handle/27135496119770819830 |
id |
ndltd-TW-104NTU05392061 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-104NTU05392061 2017-06-03T04:41:59Z http://ndltd.ncl.edu.tw/handle/27135496119770819830 Bi-perceptron for Chinese Web News Categorization Bi-perceptron 分類中文網頁新聞 Jian Pan 潘健 Master's National Taiwan University Graduate Institute of Computer Science and Information Engineering 104 Mobile news, because of its inherently high frequency, has become a popular area pursued by many commercial companies in China. News categorization is an important technology in automatic news processing. Many supervised learning methods can be applied in this area, among which the Support Vector Machine (SVM) achieves state-of-the-art performance with discrete features. This thesis proposes bi-perceptron learning to solve the binary classification problem, in the hope of achieving results comparable to or even better than SVM. Bi-perceptron learning is a divide-and-conquer idea, and we implement a basic approach to it. We divide the classification problem into three steps (data partition, base classification, and aggregation) and compare different partition and aggregation methods. Moreover, we analyze the effect of the word segmentation method, the number of keywords, the regularization of the base classifiers, and the number of partitions on categorization performance. Finally, we find a bi-perceptron learning approach that is efficient in both time and memory consumption. Cheng-Yuan Liou 劉長遠 2016 thesis 72 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 104 === Mobile news, because of its inherently high frequency, has become a popular area pursued by many commercial companies in China. News categorization is an important technology in automatic news processing. Many supervised learning methods can be applied in this area, among which the Support Vector Machine (SVM) achieves state-of-the-art performance with discrete features. This thesis proposes bi-perceptron learning to solve the binary classification problem, in the hope of achieving results comparable to or even better than SVM.
Bi-perceptron learning is a divide-and-conquer idea, and we implement a basic approach to it. We divide the classification problem into three steps (data partition, base classification, and aggregation) and compare different partition and aggregation methods. Moreover, we analyze the effect of the word segmentation method, the number of keywords, the regularization of the base classifiers, and the number of partitions on categorization performance. Finally, we find a bi-perceptron learning approach that is efficient in both time and memory consumption. |
author2 |
Cheng-Yuan Liou |
author_facet |
Cheng-Yuan Liou Jian Pan 潘健 |
author |
Jian Pan 潘健 |
spellingShingle |
Jian Pan 潘健 Bi-perceptron for Chinese Web News Categorization |
author_sort |
Jian Pan |
title |
Bi-perceptron for Chinese Web News Categorization |
title_short |
Bi-perceptron for Chinese Web News Categorization |
title_full |
Bi-perceptron for Chinese Web News Categorization |
title_fullStr |
Bi-perceptron for Chinese Web News Categorization |
title_full_unstemmed |
Bi-perceptron for Chinese Web News Categorization |
title_sort |
bi-perceptron for chinese web news categorization |
publishDate |
2016 |
url |
http://ndltd.ncl.edu.tw/handle/27135496119770819830 |
work_keys_str_mv |
AT jianpan biperceptronforchinesewebnewscategorization AT pānjiàn biperceptronforchinesewebnewscategorization AT jianpan biperceptronfēnlèizhōngwénwǎngyèxīnwén AT pānjiàn biperceptronfēnlèizhōngwénwǎngyèxīnwén |
_version_ |
1718455043438411776 |
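
The three steps named in the description (data partition, base classification, aggregation) can be illustrated with a minimal sketch. This record does not give the thesis's actual partition or aggregation methods, so everything below is an assumption made for illustration: random partitioning, a plain perceptron as the base classifier, majority-vote aggregation, and all function names and the toy data are hypothetical.

```python
import random


def train_perceptron(samples, epochs=20, lr=1.0):
    """Train a plain perceptron on (features, label) pairs, labels in {-1, +1}."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in samples:
            score = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * score <= 0:  # misclassified (or on the boundary): update
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def predict_one(model, x):
    """Predict +1 or -1 with a single base perceptron."""
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1


def partition(samples, k, seed=0):
    """Step 1 (assumed: random split): divide the training data into k parts."""
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def fit_ensemble(samples, k=3):
    """Step 2: train one base classifier per partition."""
    return [train_perceptron(part) for part in partition(samples, k)]


def predict(models, x):
    """Step 3 (assumed: majority vote): aggregate base predictions; ties go to +1."""
    votes = sum(predict_one(m, x) for m in models)
    return 1 if votes >= 0 else -1


# Hypothetical, linearly separable toy data standing in for two news categories.
data = [([2.0 + 0.1 * i, 2.0], 1) for i in range(30)]
data += [([-2.0 - 0.1 * i, -2.0], -1) for i in range(30)]

models = fit_ensemble(data, k=3)
accuracy = sum(predict(models, x) == y for x, y in data) / len(data)
print("training accuracy:", accuracy)
```

Because each base classifier sees only its own partition, the parts can be trained independently, which is the usual motivation for a divide-and-conquer scheme of this kind. Whether the thesis trains or aggregates this way is not stated in the record.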