A Fuzzy Ontology and SVM–Based Web Content Classification System


Bibliographic Details
Main Authors: Farman Ali, Pervez Khan, Kashif Riaz, Daehan Kwak, Tamer Abuhmed, Daeyoung Park, Kyung Sup Kwak
Format: Article
Language: English
Published: IEEE 2017-01-01
Series: IEEE Access
Subjects:
SVM
Online Access: https://ieeexplore.ieee.org/document/8094233/
id doaj-ac4cb88258df44e08312584683c88cf3
record_format Article
spelling doaj-ac4cb88258df44e08312584683c88cf3 2021-03-29T19:57:16Z eng
Title: A Fuzzy Ontology and SVM–Based Web Content Classification System
Source: IEEE, IEEE Access, ISSN 2169-3536, 2017-01-01, vol. 5, pp. 25781-25797
DOI: 10.1109/ACCESS.2017.2768564 (IEEE Xplore document 8094233)
Authors and affiliations:
0. Farman Ali (https://orcid.org/0000-0002-9420-1588), Department of Information and Communication Engineering, Inha University, Incheon, South Korea
1. Pervez Khan, Department of Electronics Engineering, Incheon National University, Incheon, South Korea
2. Kashif Riaz, University of the Punjab, Gujranwala, Pakistan
3. Daehan Kwak, Department of Computer Science, Kean University, Union, NJ, USA
4. Tamer Abuhmed, Department of Computer Engineering, Inha University, Incheon, South Korea
5. Daeyoung Park, Department of Information and Communication Engineering, Inha University, Incheon, South Korea
6. Kyung Sup Kwak, Department of Information and Communication Engineering, Inha University, Incheon, South Korea
URL: https://ieeexplore.ieee.org/document/8094233/
Keywords: Data mining; semantic knowledge; fuzzy ontology; SVM; adult content identification
collection DOAJ
language English
format Article
sources DOAJ
author Farman Ali
Pervez Khan
Kashif Riaz
Daehan Kwak
Tamer Abuhmed
Daeyoung Park
Kyung Sup Kwak
author_sort Farman Ali
title A Fuzzy Ontology and SVM–Based Web Content Classification System
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2017-01-01
description The volume of adult content on the World Wide Web is increasing rapidly, which makes automatically detecting adult content, and so blocking access to unsuitable websites, more challenging. Most pornographic webpage-filtering systems are based on n-gram, naïve Bayes, k-nearest neighbor, and keyword-matching mechanisms, which do not reliably extract useful data from unstructured web content. These systems have no reasoning capability to filter web content intelligently and distinguish medical webpages from adult-content webpages. In addition, because adult content is freely available on the Internet, children can easily access pornographic webpages, which is a problem for parents wishing to protect their children from such content. To solve these problems, this paper presents a support vector machine (SVM) and fuzzy ontology-based semantic knowledge system that systematically filters web content and identifies and blocks access to pornography. The proposed system classifies URLs into adult URLs and medical URLs by using a blacklist of censored webpages, for accuracy and speed. The proposed fuzzy ontology then extracts web content to determine the website type (adult content, normal, or medical) and block pornographic content. To evaluate the proposed system, the fuzzy ontology and the intelligent tools were developed using Protégé 5.1 and Java, respectively. Experimental analysis shows that the proposed system is efficient at automatically detecting and blocking adult content.
topic Data mining
semantic knowledge
fuzzy ontology
SVM
adult content identification
url https://ieeexplore.ieee.org/document/8094233/