A Multimodel Fusion Engine for Filtering Webpages

Fusing multiple existing models for filtering webpages can mitigate the shortcomings of individual filtering models. To provide an engine for such fusion, we propose a multimodel fusion engine for filtering webpages for the extraction of target webpages. This engine can handle large datasets of webp...

Bibliographic Details
Main Authors: Ziyun Deng, Tingqin He, Weiping Ding, Zehong Cao
Format: Article
Language: English
Published: IEEE 2018-01-01
Series: IEEE Access
Subjects: Multimodel; fusion; engine design; webpage filtering
Online Access: https://ieeexplore.ieee.org/document/8528301/
id doaj-d3b791e2564f4be0bfb4efb866824d9f
record_format Article
spelling doaj-d3b791e2564f4be0bfb4efb866824d9f
2021-03-29T20:28:28Z
eng
IEEE
IEEE Access
2169-3536
2018-01-01
6
66062
66071
10.1109/ACCESS.2018.2878897
8528301
A Multimodel Fusion Engine for Filtering Webpages
Ziyun Deng (0) https://orcid.org/0000-0003-1276-5222
Tingqin He (1) https://orcid.org/0000-0001-7890-7567
Weiping Ding (2) https://orcid.org/0000-0002-3180-7347
Zehong Cao (3) https://orcid.org/0000-0003-3656-0328
College of Economics and Trade, Changsha Commerce and Tourism College, Changsha, China
National Supercomputing Center in Changsha, Hunan University, Changsha, China
School of Computer Science and Technology, Nantong University, Nantong, China
Centre for Artificial Intelligence, Faculty of Engineering and Information Technologies, University of Technology Sydney, Ultimo, NSW, Australia
Fusing multiple existing models for filtering webpages can mitigate the shortcomings of individual filtering models. To provide an engine for such fusion, we propose a multimodel fusion engine for filtering webpages for the extraction of target webpages. This engine can handle large datasets of webpages crawled from websites and supports five individual filtering models and the fusion of any two of them. There are two possible fusion methods: one is to simultaneously satisfy the conditions of both individual models, and the other is to satisfy the conditions of one of the two individual models. We present the functions, architecture, and software design of the proposed engine. We use recall ratio (RR) and precision ratio (PR) as the evaluation indices of the filtering models and propose rules describing how PR and RR change when individual models are fused. We use 200 000 webpages collected by crawling the popular online shopping website “http://www.jd.com” as the experimental dataset to verify these rules. The experimental results show that two-model fusion can improve either PR or RR. Thus, the proposed engine has good practical value for engineering applications.
https://ieeexplore.ieee.org/document/8528301/
Multimodel
fusion
engine design
webpage filtering
collection DOAJ
language English
format Article
sources DOAJ
author Ziyun Deng
Tingqin He
Weiping Ding
Zehong Cao
title A Multimodel Fusion Engine for Filtering Webpages
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2018-01-01
description Fusing multiple existing models for filtering webpages can mitigate the shortcomings of individual filtering models. To provide an engine for such fusion, we propose a multimodel fusion engine for filtering webpages for the extraction of target webpages. This engine can handle large datasets of webpages crawled from websites and supports five individual filtering models and the fusion of any two of them. There are two possible fusion methods: one is to simultaneously satisfy the conditions of both individual models, and the other is to satisfy the conditions of one of the two individual models. We present the functions, architecture, and software design of the proposed engine. We use recall ratio (RR) and precision ratio (PR) as the evaluation indices of the filtering models and propose rules describing how PR and RR change when individual models are fused. We use 200 000 webpages collected by crawling the popular online shopping website “http://www.jd.com” as the experimental dataset to verify these rules. The experimental results show that two-model fusion can improve either PR or RR. Thus, the proposed engine has good practical value for engineering applications.
topic Multimodel
fusion
engine design
webpage filtering
url https://ieeexplore.ieee.org/document/8528301/
_version_ 1724194830650703872
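
The two fusion methods named in the description (a page must satisfy both individual models, or at least one of them) and the two evaluation indices (PR and RR) can be illustrated with a small sketch. This is not the authors' engine: the two toy models `by_url` and `by_keyword`, the five sample pages, and the target set are all hypothetical, chosen only to show how the indices are computed for each fusion.

```python
from typing import Callable, Dict, Set, Tuple

# A filtering model is assumed here to be a predicate over (url, text).
Filter = Callable[[str, str], bool]


def fuse_and(a: Filter, b: Filter) -> Filter:
    """Keep a page only if both individual models accept it."""
    return lambda url, text: a(url, text) and b(url, text)


def fuse_or(a: Filter, b: Filter) -> Filter:
    """Keep a page if at least one of the two models accepts it."""
    return lambda url, text: a(url, text) or b(url, text)


def select(model: Filter, pages: Dict[str, str]) -> Set[str]:
    """Return the URLs of all pages the model accepts."""
    return {url for url, text in pages.items() if model(url, text)}


def precision_recall(selected: Set[str], targets: Set[str]) -> Tuple[float, float]:
    """PR = hits / selected pages, RR = hits / target pages."""
    hits = len(selected & targets)
    pr = hits / len(selected) if selected else 0.0
    rr = hits / len(targets) if targets else 0.0
    return pr, rr


# Hypothetical toy data: url -> page text, plus the true target pages.
pages = {
    "item/1": "buy phone price",
    "item/2": "phone review",
    "list/3": "buy phone price list",
    "item/4": "about us",
    "help/5": "price policy",
}
targets = {"item/1", "item/2", "list/3"}


def by_url(url: str, text: str) -> bool:
    # Hypothetical model 1: accept pages under the "item/" path.
    return url.startswith("item/")


def by_keyword(url: str, text: str) -> bool:
    # Hypothetical model 2: accept pages whose text mentions "price".
    return "price" in text


for name, model in [
    ("by_url", by_url),
    ("by_keyword", by_keyword),
    ("AND fusion", fuse_and(by_url, by_keyword)),
    ("OR fusion", fuse_or(by_url, by_keyword)),
]:
    pr, rr = precision_recall(select(model, pages), targets)
    print(f"{name:>10}: PR={pr:.2f} RR={rr:.2f}")
```

On this toy data the AND fusion raises PR at the cost of RR, and the OR fusion raises RR at the cost of PR. As a matter of set inclusion, an AND fusion can never select more target pages than either model alone, and an OR fusion can never select fewer; how PR moves depends on the data.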