Mining Blogger's Glossary Impressions from a Micro Blog Corpus
Master's === National Tsing Hua University === Department of Computer Science === 97 === Micro blogging is a type of blog service that has emerged over the last two years. Its purpose is to provide a platform on which users can share their lives or moods in real time. Unlike traditional blog articles, micro blog articles are short. They usually only contain...
Main Authors: | , |
---|---|
Other Authors: | |
Format: | Others |
Language: | en_US |
Published: | 2009 |
Online Access: | http://ndltd.ncl.edu.tw/handle/68140753239717608290 |
id |
ndltd-TW-097NTHU5392135 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-097NTHU5392135 2015-11-13T04:08:49Z http://ndltd.ncl.edu.tw/handle/68140753239717608290 Mining Blogger's Glossary Impressions from a Micro Blog Corpus 從微部落格語料中探勘部落客的詞彙意象 Yang, Ting-Hao 楊庭豪 Master's National Tsing Hua University Department of Computer Science 97 Soo, Von-Wun 蘇豐文 2009 thesis 35 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others
|
sources |
NDLTD |
description |
Master's === National Tsing Hua University === Department of Computer Science === 97 === Micro blogging is a type of blog service that has emerged over the last two years. Its purpose is to provide a platform on which users can share their lives or moods in real time. Unlike traditional blog articles, micro blog articles are short and usually contain only a few words. Because the service is so new, research on micro blogs is still lacking.
Micro blog articles carry information attached by the blogger, such as article labels and emotion icons. In this paper, we propose a method that uses this attached information together with semantic information to determine the impression a blogger has of a specific term. First, we parse the corpus and extract the semantic concepts contained in each article, using POS tags and dependency relations between terms to extract the semantic structure. We then apply Latent Semantic Analysis (LSA) to find articles with similar semantic concepts. We assume that articles with similar semantic concepts, or that share the same attached information, are likely to convey similar impressions. We represent articles as nodes and these relations as edges to construct a micro blog network. Once the network is complete, we use a random walk model to describe how impressions are transmitted between articles.
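The pipeline in the preceding paragraph (LSA similarity, network construction, random walk) can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not give the similarity threshold, the exact random walk formulation, or how article scores are aggregated into term impressions. The function names, the `sim_threshold` and `restart` parameters, the random-walk-with-restart variant, and the toy data are all assumptions made for illustration, and the sketch stops at article-level scores.

```python
import numpy as np

def lsa_embed(term_doc, k):
    """Embed documents in a k-dimensional latent space via truncated SVD.
    term_doc is a terms x documents count matrix."""
    _, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    return (np.diag(s[:k]) @ vt[:k]).T  # documents x k

def cosine_similarity(vecs):
    """Pairwise cosine similarity; near-zero vectors get similarity ~0."""
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    safe = np.where(norms > 1e-12, norms, 1.0)
    unit = vecs / safe
    return unit @ unit.T

def build_graph(doc_vecs, labels, sim_threshold=0.8):
    """Connect two articles if they are semantically similar or
    share the same blogger-attached label (assumed edge rule)."""
    sim = cosine_similarity(doc_vecs)
    n = len(labels)
    adj = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            same_label = labels[i] is not None and labels[i] == labels[j]
            if sim[i, j] >= sim_threshold or same_label:
                adj[i, j] = 1.0
    return adj

def propagate(adj, seeds, restart=0.15, iters=100):
    """Random walk with restart (an assumed variant): each article's score
    is the expected seed value at the node where a walk from it stops."""
    row_sums = adj.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0
    transition = adj / row_sums
    seeds = np.asarray(seeds, dtype=float)
    scores = seeds.copy()
    for _ in range(iters):
        scores = restart * seeds + (1 - restart) * (transition @ scores)
    return scores

def classify(score, margin=0.1):
    """Map a score to positive / negative / neutral (margin is hypothetical)."""
    if score > margin:
        return "positive"
    if score < -margin:
        return "negative"
    return "neutral"

# Toy data: 5 terms x 5 articles. Articles 0-1 and 2-3 share vocabulary;
# article 4 shares only a label with article 0.
term_doc = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1],
], dtype=float)
labels = ["work", None, None, None, "work"]
seeds = [1, 0, -1, 0, 0]  # e.g. +1 for a happy emotion icon, -1 for a sad one

adj = build_graph(lsa_embed(term_doc, k=3), labels)
scores = propagate(adj, seeds)
print([classify(s) for s in scores])
```

In this toy setup, the seed impressions on articles 0 and 2 spread to their neighbours through similarity and label edges, which is the intuition behind the impression-transmitting network described in the abstract.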
In the experiments, we collect a corpus from the micro blog website Plurk. We follow one blogger for a few months and classify terms referring to people and objects as positive, negative, or neutral. We select about 100 terms for the experiment, and the resulting precision is 69%.
|
author2 |
Soo, Von-Wun |
author_facet |
Soo, Von-Wun Yang, Ting-Hao 楊庭豪 |
author |
Yang, Ting-Hao 楊庭豪 |
spellingShingle |
Yang, Ting-Hao 楊庭豪 Mining Blogger's Glossary Impressions from a Micro Blog Corpus |
author_sort |
Yang, Ting-Hao |
title |
Mining Blogger's Glossary Impressions from a Micro Blog Corpus |
publishDate |
2009 |
url |
http://ndltd.ncl.edu.tw/handle/68140753239717608290 |
work_keys_str_mv |
AT yangtinghao miningbloggersglossaryimpressionsfromamicroblogcorpus AT yángtíngháo miningbloggersglossaryimpressionsfromamicroblogcorpus AT yangtinghao cóngwēibùluògéyǔliàozhōngtànkānbùluòkèdecíhuìyìxiàng AT yángtíngháo cóngwēibùluògéyǔliàozhōngtànkānbùluòkèdecíhuìyìxiàng |
_version_ |
1718128349560176640 |