Mining Blogger's Glossary Impressions from a Micro Blog Corpus

Master's === National Tsing Hua University === Department of Computer Science === 97 === A micro blog is a new type of blogging service that has emerged over the last two years. Its purpose is to provide a platform where users can share their lives or moods in real time. Unlike traditional blog articles, micro blog articles are short. They usually only contain...

Bibliographic Details
Main Authors: Yang, Ting-Hao, 楊庭豪
Other Authors: Soo, Von-Wun
Format: Others
Language: en_US
Published: 2009
Online Access: http://ndltd.ncl.edu.tw/handle/68140753239717608290
id ndltd-TW-097NTHU5392135
record_format oai_dc
spelling ndltd-TW-097NTHU5392135 2015-11-13T04:08:49Z http://ndltd.ncl.edu.tw/handle/68140753239717608290 Mining Blogger's Glossary Impressions from a Micro Blog Corpus 從微部落格語料中探勘部落客的詞彙意象 Yang, Ting-Hao 楊庭豪 Master's, National Tsing Hua University, Department of Computer Science, 97 Soo, Von-Wun 蘇豐文 2009 學位論文 ; thesis 35 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Tsing Hua University === Department of Computer Science === 97 === A micro blog is a new type of blogging service that has emerged over the last two years. Its purpose is to provide a platform where users can share their lives or moods in real time. Unlike traditional blog articles, micro blog articles are short, usually only a few words. Because the service is so new, research on micro blogs is still scarce. Micro blog articles carry additional information annotated by the blogger, such as article labels and emoticons. In this thesis, we propose a method that uses these annotations together with semantic information to determine the impression a blogger holds of a specific term. First, we parse the corpus and extract the semantic concepts contained in each article, using POS tags and dependency relations between terms to extract semantic structure. We then apply Latent Semantic Analysis (LSA) to find articles with similar semantic concepts. We assume that articles with similar semantic concepts, or articles that share the same annotations, are likely to convey similar impressions. We represent micro blog articles as nodes and these relations as edges to construct a micro blog network. Once the network is complete, we use a random walk model to describe how impressions are transmitted between articles. In the experiments, we collect a corpus from the micro blog website Plurk. We follow one blogger for a few months and classify terms referring to persons and objects as positive, negative, or neutral. We select about 100 terms for the experiment; the precision is 69%.
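The random-walk step described in the abstract can be illustrated with a small sketch. The record does not give the thesis's actual formulation, so the following assumes a random walk with restart over a weighted article graph. The function names (`propagate`, `classify`), the restart probability, the edge weights, and the classification threshold are all hypothetical choices made for illustration, not values taken from the thesis.

```python
def propagate(edges, seeds, restart=0.5, iterations=100):
    """Spread impression scores over an article graph (random walk with restart).

    edges: {article: {neighbour: weight}}. The weights could come from LSA
           similarity or shared labels/emoticons (an assumption; the record
           does not say how edges are weighted).
    seeds: {article: score in [-1, 1]} for articles whose impression is known.
    """
    nodes = set(edges)
    for neighbours in edges.values():
        nodes.update(neighbours)
    scores = {n: seeds.get(n, 0.0) for n in nodes}
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            neighbours = edges.get(n, {})
            total = sum(neighbours.values())
            # One step of the walk: weighted average of the neighbours' scores.
            walk = (sum(w * scores[m] for m, w in neighbours.items()) / total
                    if total else 0.0)
            # With probability `restart` the walker returns to the node's own seed.
            updated[n] = restart * seeds.get(n, 0.0) + (1 - restart) * walk
        scores = updated
    return scores


def classify(score, threshold=0.1):
    """Map a score to positive / negative / neutral (threshold is hypothetical)."""
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


# Toy graph: four articles in a chain; a1 and a4 carry known impressions.
edges = {
    "a1": {"a2": 1.0},
    "a2": {"a1": 1.0, "a3": 0.5},
    "a3": {"a2": 0.5, "a4": 1.0},
    "a4": {"a3": 1.0},
}
seeds = {"a1": 1.0, "a4": -1.0}

scores = propagate(edges, seeds)
labels = {article: classify(score) for article, score in scores.items()}
```

In this toy graph the unlabeled article a2 ends up leaning positive because it is tied most strongly to a1, while a3 leans negative because it is tied most strongly to a4.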
author2 Soo, Von-Wun
author_facet Soo, Von-Wun
Yang, Ting-Hao
楊庭豪
author Yang, Ting-Hao
楊庭豪
title Mining Blogger's Glossary Impressions from a Micro Blog Corpus