Integration of Topic Hierarchies without Mutually Labeled Data
Master's === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 92 === In the problem of integrating documents from different sources into a comprehensive topic hierarchy, the objective is to develop efficient techniques that improve the accuracy of traditional categorization methods by incorporating categorization inform...
Main Authors: | Chi-Wei Hung, 洪啟偉 |
---|---|
Other Authors: | Jyh-Jong Tsay |
Format: | Others |
Language: | en_US |
Published: | 2004 |
Online Access: | http://ndltd.ncl.edu.tw/handle/72828003476969356494 |
id |
ndltd-TW-092CCU00392071 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-092CCU00392071 2016-01-04T04:08:29Z http://ndltd.ncl.edu.tw/handle/72828003476969356494 Integration of Topic Hierarchies without Mutually Labeled Data 階層式分類目錄在沒有共同標記資料上的整合 Chi-Wei Hung 洪啟偉 Master's National Chung Cheng University Graduate Institute of Computer Science and Information Engineering 92 In the problem of integrating documents from different sources into a comprehensive topic hierarchy, the objective is to develop efficient techniques that improve the accuracy of traditional categorization methods by incorporating the categorization information provided by the data sources into the categorization process. On the World-Wide Web, such categorization information is often available from information sources, and many of the topic hierarchies adopted by current information sources are highly related. We believe that this categorization information can be used to improve classification accuracy. However, this kind of integration needs mutually labeled data between the two hierarchies. On the World-Wide Web there may not be enough documents labeled in both hierarchies, or there may be no mutually labeled data at all. In this thesis, we study the problem of integrating documents from different sources into a comprehensive topic hierarchy without mutually labeled data. To solve this problem, the Bayesian Extension algorithm needs a prediction algorithm. We present several techniques that predict relations between topic hierarchies and incorporate categorization information from source hierarchies into traditional classification methods. Experiments on collections from Openfind and Yam, and from Google and Yahoo, all popular web sites, show that incorporating the predicted mapping from source hierarchies to target hierarchies can improve classification accuracy. Jyh-Jong Tsay 蔡志忠 2004 thesis 0 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Chung Cheng University === Graduate Institute of Computer Science and Information Engineering === 92 === In the problem of integrating documents from different sources into a comprehensive topic hierarchy, the objective is to develop efficient techniques that improve the accuracy of traditional categorization methods by incorporating the categorization information provided by the data sources into the categorization process. On the World-Wide Web, such categorization information is often available from information sources, and many of the topic hierarchies adopted by current information sources are highly related. We believe that this categorization information can be used to improve classification accuracy. However, this kind of integration needs mutually labeled data between the two hierarchies. On the World-Wide Web there may not be enough documents labeled in both hierarchies, or there may be no mutually labeled data at all. In this thesis, we study the problem of integrating documents from different sources into a comprehensive topic hierarchy without mutually labeled data. To solve this problem, the Bayesian Extension algorithm needs a prediction algorithm. We present several techniques that predict relations between topic hierarchies and incorporate categorization information from source hierarchies into traditional classification methods. Experiments on collections from Openfind and Yam, and from Google and Yahoo, all popular web sites, show that incorporating the predicted mapping from source hierarchies to target hierarchies can improve classification accuracy. |
author2 |
Jyh-Jong Tsay |
author_facet |
Jyh-Jong Tsay Chi-Wei Hung 洪啟偉 |
author |
Chi-Wei Hung 洪啟偉 |
spellingShingle |
Chi-Wei Hung 洪啟偉 Integration of Topic Hierarchies without Mutually Labeled Data |
author_sort |
Chi-Wei Hung |
title |
Integration of Topic Hierarchies without Mutually Labeled Data |
title_short |
Integration of Topic Hierarchies without Mutually Labeled Data |
title_full |
Integration of Topic Hierarchies without Mutually Labeled Data |
title_fullStr |
Integration of Topic Hierarchies without Mutually Labeled Data |
title_full_unstemmed |
Integration of Topic Hierarchies without Mutually Labeled Data |
title_sort |
integration of topic hierarchies without mutually labeled data |
publishDate |
2004 |
url |
http://ndltd.ncl.edu.tw/handle/72828003476969356494 |
work_keys_str_mv |
AT chiweihung integrationoftopichierarchieswithoutmutuallylabeleddata AT hóngqǐwěi integrationoftopichierarchieswithoutmutuallylabeleddata AT chiweihung jiēcéngshìfēnlèimùlùzàiméiyǒugòngtóngbiāojìzīliàoshàngdezhěnghé AT hóngqǐwěi jiēcéngshìfēnlèimùlùzàiméiyǒugòngtóngbiāojìzīliàoshàngdezhěnghé |
_version_ |
1718158151694417920 |
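
The abstract describes predicting relations between a source and a target hierarchy when no document is labeled in both, then feeding that prediction into a conventional classifier. The record does not give the thesis's actual algorithms, so the following is only a minimal sketch of one plausible reading: a base classifier trained on the target hierarchy labels the source documents, the resulting votes give a smoothed estimate of P(target category | source category), and that estimate reweights the base posterior of each incoming document. All names here (`TARGETS`, `toy_posterior`, `predict_mapping`, `integrate`, the `weight` and `smoothing` parameters, and the toy data) are hypothetical and are not taken from the thesis.

```python
from collections import Counter, defaultdict

TARGETS = ("Sports", "Finance")  # hypothetical target-hierarchy categories


def toy_posterior(doc):
    """Stand-in for a text classifier trained on the target hierarchy.

    Returns P(target category | document text) from two keyword counts
    with add-one smoothing. A real system would use a trained model.
    """
    words = doc.split()
    ball, stock = words.count("ball"), words.count("stock")
    p_sports = (ball + 1) / (ball + stock + 2)
    return {"Sports": p_sports, "Finance": 1.0 - p_sports}


def predict_mapping(source_docs, posterior, targets, smoothing=1.0):
    """Estimate P(target | source category) without mutually labeled data.

    Each source document votes for the target category the base
    classifier finds most likely; votes are smoothed and normalized.
    """
    counts = defaultdict(Counter)
    for source_cat, doc in source_docs:
        post = posterior(doc)
        counts[source_cat][max(post, key=post.get)] += 1
    mapping = {}
    for source_cat, votes in counts.items():
        total = sum(votes.values()) + smoothing * len(targets)
        mapping[source_cat] = {t: (votes[t] + smoothing) / total for t in targets}
    return mapping


def integrate(doc_posterior, source_cat, mapping, weight=1.0):
    """Reweight the base posterior by the predicted mapping (assumed form).

    score(t) is proportional to P(t | doc) * P(t | source_cat) ** weight.
    """
    scores = {t: p * mapping[source_cat][t] ** weight
              for t, p in doc_posterior.items()}
    norm = sum(scores.values())
    return {t: s / norm for t, s in scores.items()}


# Toy source collection: (source category, document text) pairs.
source_docs = [
    ("Athletics", "ball ball game"),
    ("Athletics", "ball match"),
    ("Athletics", "stock ball ball"),
    ("Markets", "stock stock price"),
    ("Markets", "stock bond"),
]
mapping = predict_mapping(source_docs, toy_posterior, TARGETS)

# A document the base classifier cannot decide on its own (no keywords),
# but whose source category "Athletics" maps mostly to "Sports".
result = integrate(toy_posterior("game tonight"), "Athletics", mapping)
```

In this toy run the base classifier alone is undecided on the last document, and the predicted mapping tips it toward the target category that most "Athletics" documents were assigned to. The `weight` exponent is one assumed way to control how much the source label is trusted; setting it to 0 recovers the base classifier.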