An Effective Splog Detector for Chinese Weblogs


Bibliographic Details
Main Authors: Chia-Ping Chen, 陳佳萍
Other Authors: Meng-Chien Yang
Format: Others
Language: zh-TW
Published: 2010
Online Access: http://ndltd.ncl.edu.tw/handle/38570048217529202939
Description
Summary: Master's === Providence University === In-Service Master's Program in Information === 99 === Splogs (spam blogs) are blogs created for commercial purposes whose content is copied wholesale from other blogs or websites. Splogs crowd out the content of regular blogs and degrade the value of a blog hosting service, so splog detection has recently become a research topic in web engineering. This thesis proposes methodologies and algorithms for detecting splogs among Chinese weblogs. The goal is to help blog server maintainers detect and reduce obstructive splogs so that the server can keep functioning normally. The proposed approach combines three factors of a blog to build a similarity matrix and applies a framework based on the SVM (support vector machine) algorithm. According to the thesis, the method correctly detects Chinese splogs in a large set of Chinese weblogs and reduces the potential traffic load on the Chinese weblog server.
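
The pipeline named in the summary (a similarity matrix built from three blog factors, then an SVM) can be illustrated with a minimal sketch. This is not the thesis's implementation: the record does not say which three factors are used, so the factor names content, links and tags, the Jaccard measure, the equal weighting, the toy blogs and the plain subgradient-descent linear SVM are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

Blog = Dict[str, Set[str]]

# Hypothetical factor names: the summary does not say which three factors are used.
FACTORS = ("content", "links", "tags")


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity of two sets; defined as 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similarity_matrix(blogs: List[Blog], refs: List[Blog]) -> List[List[float]]:
    """Entry [i][j] is the mean per-factor Jaccard similarity of blogs[i] and refs[j]."""
    return [[sum(jaccard(b[f], r[f]) for f in FACTORS) / len(FACTORS) for r in refs]
            for b in blogs]


def train_linear_svm(X: List[List[float]], y: List[int], lam: float = 0.01,
                     lr: float = 0.1, epochs: int = 200) -> Tuple[List[float], float]:
    """Full-batch subgradient descent on the L2-regularised hinge loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the regulariser
        grad_b = 0.0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1:  # this point violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * g for wj, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(w: List[float], b: float, x: List[float]) -> int:
    """Return +1 (splog) or -1 (regular blog)."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


# Toy, made-up training blogs: two splogs (+1) and two regular blogs (-1).
train_blogs: List[Blog] = [
    {"content": {"buy", "cheap", "watch", "now"}, "links": {"shop.example"}, "tags": {"deal"}},
    {"content": {"buy", "cheap", "bags", "now"}, "links": {"shop.example"}, "tags": {"deal"}},
    {"content": {"today", "hiking", "trip", "photo"}, "links": {"park.example"}, "tags": {"travel"}},
    {"content": {"recipe", "noodle", "dinner", "photo"}, "links": {"food.example"}, "tags": {"cooking"}},
]
labels = [1, 1, -1, -1]

# Each blog is represented by its row of similarities to the training blogs.
X = similarity_matrix(train_blogs, train_blogs)
w, b = train_linear_svm(X, labels)

new_blogs: List[Blog] = [
    {"content": {"buy", "cheap", "shoes", "now"}, "links": {"shop.example"}, "tags": {"deal"}},
    {"content": {"hiking", "trip", "mountain", "photo"}, "links": {"park.example"}, "tags": {"travel"}},
]
new_predictions = [predict(w, b, row) for row in similarity_matrix(new_blogs, train_blogs)]
```

Representing each blog by its similarities to known blogs lets a linear classifier separate copied content from original content without hand-built text features. A production system would more likely pass the similarity matrix to an established SVM library as a precomputed kernel.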