Efficient near duplicate document detection for specialized corpora
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Includes bibliographical references (p. 75-77). Knowledge of near duplicate documents can be advantageous to search engines, even those that only cover a small enterprise or sp...
Main Author: | |
---|---|
Other Authors: | |
Format: | Others |
Language: | English |
Published: | Massachusetts Institute of Technology, 2010 |
Subjects: | |
Online Access: | http://hdl.handle.net/1721.1/53116 |