Identify Human Rellationship From Retrieved Snippets

Master's === National Chiao Tung University === Institute of Computer Science and Engineering === 100 === Identifying relations among entities is an important task in document processing. Relations identified in previous research include co-working relations between persons and organizations, relations between diseases and medicines, relations between authors...

Bibliographic Details
Main Authors: Nieh, Chia-Chi, 聶家祺
Other Authors: Liang, Tyne
Format: Others
Language: en_US
Published: 2011
Online Access: http://ndltd.ncl.edu.tw/handle/62196018224601952925
id ndltd-TW-100NCTU5394049
record_format oai_dc
spelling ndltd-TW-100NCTU5394049 2015-10-13T20:37:28Z http://ndltd.ncl.edu.tw/handle/62196018224601952925 Identify Human Rellationship From Retrieved Snippets 從搜尋結果進行人際關係辨識 (Identifying Interpersonal Relationships from Search Results) Nieh, Chia-Chi 聶家祺 Master's, National Chiao Tung University, Institute of Computer Science and Engineering, 100 Liang, Tyne 梁婷 2011 thesis 32 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Chiao Tung University === Institute of Computer Science and Engineering === 100 === Identifying relations among entities is an important task in document processing. Relations identified in previous research include co-working relations between persons and organizations, relations between diseases and medicines, relations between authors and artifacts, interactions between proteins, and equivalence relations among nominals. Most identification methods are based on machine learning algorithms or pattern matching; few are based on parsing results. The corpora used for relation identification can be static or dynamic (such as search engine results). Although methods using static corpora generally outperform those using dynamic corpora, dynamic corpora contain more up-to-date information. In this thesis, we use retrieved snippets to identify human relationships and Wikipedia to construct a development corpus. We extract domain words from the development corpus with a bootstrapping algorithm and expand queries to obtain more accurate search results. To speed up document processing, simple methods are implemented for part-of-speech tagging, person name tagging, and pronominal anaphora resolution. The proposed kinship identification is implemented with pattern matching and a support vector machine (SVM). The features used for identification include the number and position of clue words and the cosine similarity of entities related to the persons. The kinship identifier yields an F-score of 0.86 in an experiment containing 396 kinship instances, and the co-working identifier yields an F-score of 0.75 on 175 co-working instances.
author2 Liang, Tyne
author_facet Liang, Tyne
Nieh, Chia-Chi
聶家祺
author Nieh, Chia-Chi
聶家祺
publishDate 2011
url http://ndltd.ncl.edu.tw/handle/62196018224601952925