Identify Human Relationship From Retrieved Snippets
Master's === National Chiao Tung University === Institute of Computer Science and Engineering === 100 === Identifying relations among entities is an important task in document processing. The relations identified in previous research include co-working relations between persons and organizations, relations among diseases and medicines, relations between authors...
Main Authors: | Nieh, Chia-Chi, 聶家祺 |
---|---|
Other Authors: | Liang, Tyne |
Format: | Others |
Language: | en_US |
Published: | 2011 |
Online Access: | http://ndltd.ncl.edu.tw/handle/62196018224601952925 |
id | ndltd-TW-100NCTU5394049 |
---|---|
record_format | oai_dc |
spelling | ndltd-TW-100NCTU5394049 2015-10-13T20:37:28Z http://ndltd.ncl.edu.tw/handle/62196018224601952925 Identify Human Rellationship From Retrieved Snippets 從搜尋結果進行人際關係辨識 Nieh, Chia-Chi 聶家祺 Master's National Chiao Tung University Institute of Computer Science and Engineering 100 Liang, Tyne 梁婷 2011 degree thesis ; thesis 32 en_US |
collection | NDLTD |
language | en_US |
format | Others |
sources | NDLTD |
description | Master's === National Chiao Tung University === Institute of Computer Science and Engineering === 100 === Identifying relations among entities is an important task in document processing. The relations identified in previous research include co-working relations between persons and organizations, relations among diseases and medicines, relations between authors and artifacts, interactions between proteins, and equivalence relations among nominals. Most identification methods are based on machine learning algorithms or pattern matching; few are based on parsing results. The corpora used for relation identification can be static or dynamic (such as search engine results). Although methods that identify relations from static corpora generally outperform those that use dynamic corpora, dynamic corpora contain more up-to-date information. In this thesis, we use retrieved snippets to identify human relationships and use Wikipedia to construct a development corpus. We extract domain words from the development corpus with a bootstrapping algorithm and expand queries to obtain more accurate search results. To speed up document processing, simple methods are implemented for part-of-speech tagging, person name tagging, and pronominal anaphora resolution. The proposed kinship identification is implemented with pattern matching and a support vector machine (SVM). The features used for identification include the number and position of clue words and the cosine similarity of entities related to the persons. The kinship identifier yields an F-score of 0.86 in an experiment containing 396 kinship instances, and the co-working identifier yields an F-score of 0.75 on 175 co-working instances. |
author2 | Liang, Tyne |
author_facet | Liang, Tyne Nieh, Chia-Chi 聶家祺 |
author | Nieh, Chia-Chi 聶家祺 |
spellingShingle | Nieh, Chia-Chi 聶家祺 Identify Human Rellationship From Retrieved Snippets |
author_sort | Nieh, Chia-Chi |
title | Identify Human Rellationship From Retrieved Snippets |
title_short | Identify Human Rellationship From Retrieved Snippets |
title_full | Identify Human Rellationship From Retrieved Snippets |
title_fullStr | Identify Human Rellationship From Retrieved Snippets |
title_full_unstemmed | Identify Human Rellationship From Retrieved Snippets |
title_sort | identify human rellationship from retrieved snippets |
publishDate | 2011 |
url | http://ndltd.ncl.edu.tw/handle/62196018224601952925 |
work_keys_str_mv | AT niehchiachi identifyhumanrellationshipfromretrievedsnippets AT nièjiāqí identifyhumanrellationshipfromretrievedsnippets AT niehchiachi cóngsōuxúnjiéguǒjìnxíngrénjìguānxìbiànshí AT nièjiāqí cóngsōuxúnjiéguǒjìnxíngrénjìguānxìbiànshí |
_version_ | 1718050652405366784 |
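
The description lists two kinds of identification features: the number and position of clue words, and the cosine similarity of entities related to the persons. The record does not say how the thesis computes them, so the following is only a minimal sketch under stated assumptions: each person is represented by a bag of co-occurring entity names, a snippet is a list of tokens, and the function names and sample data are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(entities_a, entities_b):
    """Cosine similarity between two bags of entity names (assumed representation)."""
    vec_a, vec_b = Counter(entities_a), Counter(entities_b)
    # Dot product over the names that appear in the first bag.
    dot = sum(count * vec_b[name] for name, count in vec_a.items())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty bag shares nothing with any other bag
    return dot / (norm_a * norm_b)


def clue_word_features(tokens, clue_words):
    """Return (count, first position) of clue words in a tokenized snippet.

    The position is -1 when the snippet contains no clue word.
    """
    positions = [i for i, tok in enumerate(tokens) if tok in clue_words]
    return len(positions), (positions[0] if positions else -1)


# Hypothetical sample data, for illustration only.
snippet = ["Alice", "is", "the", "mother", "of", "Bob"]
kinship_clues = {"mother", "father", "son", "daughter"}
features = clue_word_features(snippet, kinship_clues)
similarity = cosine_similarity(["Acme", "Paris"], ["Acme", "London"])
```

Feature values of this kind could then be assembled into a vector and passed to an SVM classifier; the actual feature encoding used in the thesis may differ.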