Summary: | Master's === National Taiwan Ocean University (國立臺灣海洋大學) === Department of Computer Science and Engineering === 98 === This thesis addresses opinion target identification: not only detecting whether an opinionated sentence has a target, but also determining which entity the opinion comments on. Opinionated sentences that target the same entity can then be collected to produce a summary of opinions about that entity.
This thesis focuses on the tourism domain, so the opinion targets are tourist attractions. The experimental data come from blog articles in the domestic tourism category on Wretch.cc. Annotators labeled the opinion polarity and the opinion target of every sentence. We propose several strategies and features for identifying opinion targets, including attraction name affix substrings, attraction name keywords, tourism-related opinion words, and a 2-level classifier, among others. We used machine learning methods to train classifiers for opinion target identification.
Experiments were conducted on a test set of 156 blog articles collected from Wretch.cc. The 1-level classifier achieves overall precision and recall of 46.77% and 53.16%, respectively. The 2-level classifier first detects whether an opinion target occurs in a sentence and then identifies the exact opinion target; its overall precision and recall reach 51.30% and 54.21%, respectively. For opinion target detection, precision and recall are 55.89% and 59.30%, respectively. For opinion target identification, they are 90.06% and 89.91%, respectively.
|
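The detect-then-identify flow of the 2-level classifier can be illustrated with a minimal sketch. This is not the thesis's implementation: the keyword-matching functions `has_target` and `pick_target` are hypothetical stand-ins for the trained classifiers, the attraction names and sentences are made-up examples, and the precision and recall definitions (correct targets over proposed targets, and over gold targets) are an assumption about how the scores might be computed.

```python
from typing import Callable, Optional, Sequence, Tuple

# Illustrative attraction names only; the thesis's actual lexicon is not given here.
ATTRACTIONS = ("Sun Moon Lake", "Taipei 101")


def has_target(sentence: str) -> bool:
    """Level 1 stand-in: does the sentence appear to mention any attraction?"""
    return any(name in sentence for name in ATTRACTIONS)


def pick_target(sentence: str) -> str:
    """Level 2 stand-in: return the first known attraction found in the sentence."""
    return next(name for name in ATTRACTIONS if name in sentence)


def two_level_predict(
    sentence: str,
    detect: Callable[[str], bool],
    identify: Callable[[str], str],
) -> Optional[str]:
    """Run detection first; only sentences that pass are sent to identification."""
    if not detect(sentence):
        return None
    return identify(sentence)


def precision_recall(
    predicted: Sequence[Optional[str]],
    gold: Sequence[Optional[str]],
) -> Tuple[float, float]:
    """Assumed scoring: a prediction is correct if it equals the gold target."""
    proposed = sum(p is not None for p in predicted)
    relevant = sum(g is not None for g in gold)
    correct = sum(p is not None and p == g for p, g in zip(predicted, gold))
    precision = correct / proposed if proposed else 0.0
    recall = correct / relevant if relevant else 0.0
    return precision, recall


# Toy data: the last sentence has a gold target that is never named explicitly.
sentences = [
    "Sun Moon Lake was beautiful.",
    "The food was too salty.",
    "Taipei 101 felt crowded.",
    "We loved the view there.",
]
gold = ["Sun Moon Lake", None, "Taipei 101", "Sun Moon Lake"]

predicted = [two_level_predict(s, has_target, pick_target) for s in sentences]
precision, recall = precision_recall(predicted, gold)
print(predicted, precision, recall)
```

In this toy run the last sentence is missed at the detection level, so it never reaches identification. This shows how an error at the first level limits the recall of the whole pipeline regardless of how accurate the second level is.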