Efficient string similarity join in multi-core and distributed systems.

In the area of big data, a significant challenge for string similarity join is finding all similar pairs efficiently. In this paper, we propose a parallel processing framework for efficient string similarity join. First, the input is split into disjoint small subsets according to the joint frequ...

Bibliographic Details
Main Authors: Cairong Yan, Xue Zhao, Qinglong Zhang, Yongfeng Huang
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2017-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC5344375?pdf=render