Position-Weighted Measures for the Company Name-Matching Problem

碩士 === 國立臺灣大學 === 經濟學研究所 === 104 === This thesis focuses on the company name-matching problem. We analyze common errors and complications in company names committed by users that make the company name-matching problem difficult. Although the company name-matching problem is a type of name-matching p...

Full description

Bibliographic Details
Main Authors: Ching-Kuo Li, 李清國
Other Authors: Yusen Sung
Format: Others
Language:en_US
Published: 2016
Online Access:http://ndltd.ncl.edu.tw/handle/26453571913738870132
Description
Summary:碩士 === 國立臺灣大學 === 經濟學研究所 === 104 === This thesis focuses on the company name-matching problem. We analyze common errors and complications in company names committed by users that make the company name-matching problem difficult. Although the company name-matching problem is a type of name-matching problem, it has special features that make these common name-matching methods barely the best choice in the company name-matching problem. Therefore, according to the construction of the company name, we propose a novel idea of position weight to address company name-matching problem. Then, we compare our proposed position-weighted measure with the Monge-Elkan measure and the soft TF/IDF in the popular business data set and two data sets from a major semiconductors manufacturer. The result indicates that the position-weighted measure performs best overall based on maximum F1 and our proposed rating measure in the company name-matching problem. Besides the company name, the position weighted measure can also be used in some name-matching problems that have similar construction with the company name.