Random Forest Based Searching Approach for RDF

The blend of digital and physical worlds changed the Internet significantly. Accordingly, trends to collect, access, and deliver information have changed over the Web. Such changes raised the problems of information retrieval. Search engines retrieve requested information based on the provided keywo...

Bibliographic Details
Main Author: Hatem Soliman
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects:
RDF
Online Access: https://ieeexplore.ieee.org/document/9032149/
id doaj-9f4f85843600408b9957e74445acdee7
record_format Article
spelling doaj-9f4f85843600408b9957e74445acdee7 | 2021-03-30T01:28:51Z | eng | IEEE | IEEE Access | 2169-3536 | 2020-01-01 | 8 | 50367 | 50376 | 10.1109/ACCESS.2020.2980155 | 9032149 | Random Forest Based Searching Approach for RDF | Hatem Soliman | 0 | https://orcid.org/0000-0002-7359-0123 | College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China | The blend of digital and physical worlds has changed the Internet significantly. Accordingly, trends in collecting, accessing, and delivering information over the Web have changed. These changes have raised new problems for information retrieval. Search engines retrieve requested information based on the provided keywords, which is not an efficient way to retrieve rich information. Consequently, fetching the required information is difficult without understanding the syntax and semantics of the content. Multiple existing approaches attempt to resolve this problem by exploiting linked data and Semantic Web techniques. Such approaches serialize the content using the Resource Description Framework (RDF) and process queries using SPARQL. However, an exact match between RDF content and query structure is required. Although this improves on keyword-based search, it does not provide probabilistic reasoning to assess how accurately the results relate to the query. From this perspective, this paper proposes a machine learning (random forest) based approach that predicts the fetching status of RDF by treating RDF requests as a classification problem. First, we preprocess the RDFs to convert them into N-Triples format. Then, a feature vector is constructed for each RDF from the preprocessed data. After that, a random forest classifier is trained to predict the fetching status of RDFs. The proposed approach is evaluated on an open-source DBpedia dataset. The 10-fold cross-validation results indicate that the proposed approach is accurate and surpasses the state of the art. | https://ieeexplore.ieee.org/document/9032149/ | Semantic Web | RDF | machine learning | classification
collection DOAJ
language English
format Article
sources DOAJ
author Hatem Soliman
spellingShingle Hatem Soliman
Random Forest Based Searching Approach for RDF
IEEE Access
Semantic Web
RDF
machine learning
classification
author_facet Hatem Soliman
author_sort Hatem Soliman
title Random Forest Based Searching Approach for RDF
title_short Random Forest Based Searching Approach for RDF
title_full Random Forest Based Searching Approach for RDF
title_fullStr Random Forest Based Searching Approach for RDF
title_full_unstemmed Random Forest Based Searching Approach for RDF
title_sort random forest based searching approach for rdf
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
description The blend of digital and physical worlds has changed the Internet significantly. Accordingly, trends in collecting, accessing, and delivering information over the Web have changed. These changes have raised new problems for information retrieval. Search engines retrieve requested information based on the provided keywords, which is not an efficient way to retrieve rich information. Consequently, fetching the required information is difficult without understanding the syntax and semantics of the content. Multiple existing approaches attempt to resolve this problem by exploiting linked data and Semantic Web techniques. Such approaches serialize the content using the Resource Description Framework (RDF) and process queries using SPARQL. However, an exact match between RDF content and query structure is required. Although this improves on keyword-based search, it does not provide probabilistic reasoning to assess how accurately the results relate to the query. From this perspective, this paper proposes a machine learning (random forest) based approach that predicts the fetching status of RDF by treating RDF requests as a classification problem. First, we preprocess the RDFs to convert them into N-Triples format. Then, a feature vector is constructed for each RDF from the preprocessed data. After that, a random forest classifier is trained to predict the fetching status of RDFs. The proposed approach is evaluated on an open-source DBpedia dataset. The 10-fold cross-validation results indicate that the proposed approach is accurate and surpasses the state of the art.
topic Semantic Web
RDF
machine learning
classification
url https://ieeexplore.ieee.org/document/9032149/
work_keys_str_mv AT hatemsoliman randomforestbasedsearchingapproachforrdf
_version_ 1724186928094380032
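
Illustrative sketch (not part of the catalog record): the description field outlines a pipeline that converts RDF to N-Triples and then builds a feature vector per RDF before training a classifier. The record does not say which features the article uses, so the five counts below are assumptions chosen only for illustration, as are the function names `parse_ntriples` and `feature_vector`, the simplified line pattern, and the `example.org` sample data.

```python
import re
from typing import List, Tuple

Triple = Tuple[str, str, str]

# Simplified N-Triples line pattern (an assumption, not the full grammar):
# subject is an IRI or blank node, predicate is an IRI, and the object is
# an IRI, a blank node, or a quoted literal with optional datatype/language.
TRIPLE_RE = re.compile(
    r'^\s*(<[^>]*>|_:\S+)\s+(<[^>]*>)\s+'
    r'(<[^>]*>|_:\S+|"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?)'
    r'\s*\.\s*$'
)


def parse_ntriples(text: str) -> List[Triple]:
    """Return (subject, predicate, object) for each well-formed line."""
    triples: List[Triple] = []
    for line in text.splitlines():
        # Skip blank lines and comments.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        match = TRIPLE_RE.match(line)
        if match:  # Lines that do not fit the pattern are ignored.
            triples.append(match.groups())
    return triples


def feature_vector(triples: List[Triple]) -> List[int]:
    """Hypothetical features: simple structural counts over the triples."""
    subjects = {s for s, _, _ in triples}
    predicates = {p for _, p, _ in triples}
    literal_objects = sum(1 for _, _, o in triples if o.startswith('"'))
    iri_objects = sum(1 for _, _, o in triples if o.startswith("<"))
    return [
        len(triples),
        len(subjects),
        len(predicates),
        literal_objects,
        iri_objects,
    ]


# Made-up sample data for illustration only.
SAMPLE = '''
<http://example.org/resource/A> <http://example.org/prop/label> "Item A"@en .
<http://example.org/resource/A> <http://example.org/prop/linksTo> <http://example.org/resource/B> .
'''

if __name__ == "__main__":
    print(feature_vector(parse_ntriples(SAMPLE)))
```

Vectors of this kind, paired with fetching-status labels, could then be passed to a random forest implementation such as scikit-learn's `RandomForestClassifier` and scored with `cross_val_score(clf, X, y, cv=10)` to mirror the 10-fold cross-validation mentioned in the description; that step is omitted here because the record gives no details about labels or hyperparameters.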