Random Forest Based Searching Approach for RDF
The blend of digital and physical worlds changed the Internet significantly. Accordingly, trends to collect, access, and deliver information have changed over the Web. Such changes raised the problems of information retrieval. Search engines retrieve requested information based on the provided keywo...
Main Author: | Hatem Soliman |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2020-01-01 |
Series: | IEEE Access |
Subjects: | Semantic Web; RDF; machine learning; classification |
Online Access: | https://ieeexplore.ieee.org/document/9032149/ |
id |
doaj-9f4f85843600408b9957e74445acdee7 |
---|---|
record_format |
Article |
spelling |
doaj-9f4f85843600408b9957e74445acdee7; 2021-03-30T01:28:51Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2020-01-01; vol. 8; pp. 50367-50376; DOI 10.1109/ACCESS.2020.2980155; document 9032149; Random Forest Based Searching Approach for RDF; Hatem Soliman (https://orcid.org/0000-0002-7359-0123), College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China; https://ieeexplore.ieee.org/document/9032149/; Semantic Web; RDF; machine learning; classification |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Hatem Soliman |
title |
Random Forest Based Searching Approach for RDF |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2020-01-01 |
description |
The blending of the digital and physical worlds has changed the Internet significantly. Accordingly, the ways in which information is collected, accessed, and delivered over the Web have changed, and these changes have raised new information retrieval problems. Search engines retrieve the requested information based on the provided keywords, which is not an efficient way to retrieve rich information. Consequently, fetching the required information is difficult without understanding the syntax and semantics of the content. Multiple existing approaches attempt to resolve this problem by exploiting linked data and Semantic Web techniques. Such approaches serialize the content using the Resource Description Framework (RDF) and process queries using SPARQL. However, they require an exact match between the RDF content and the query structure. Although this improves on keyword-based search, it does not provide probabilistic reasoning about how accurately the results relate to the query. From this perspective, this paper proposes a machine learning (random forest) based approach that predicts the fetching status of RDF by treating RDF requests as a classification problem. First, the RDF documents are preprocessed to convert them into N-Triples format. Then, a feature vector is constructed for each RDF document from its preprocessed form. After that, a random forest classifier is trained to predict the fetching status of the RDF documents. The proposed approach is evaluated on an open-source DBpedia dataset. The 10-fold cross-validation results indicate that the proposed approach is accurate and surpasses the state of the art. |
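
The first two steps named in the description (conversion to N-Triples, then one feature vector per RDF document) can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not say which features the paper uses, so the five counts below, the simplified regular expression, the function names `parse_ntriples` and `feature_vector`, and the sample triples are all hypothetical.

```python
import re

# Simplified N-Triples line pattern (an assumption, not the full grammar):
# subject = IRI or blank node, predicate = IRI,
# object = IRI, blank node, or literal with optional datatype / language tag.
TRIPLE_RE = re.compile(
    r'^\s*(<[^>]*>|_:\S+)'
    r'\s+(<[^>]*>)'
    r'\s+(<[^>]*>|_:\S+|".*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?)'
    r'\s*\.\s*$'
)


def parse_ntriples(text):
    """Return a list of (subject, predicate, object) string tuples."""
    triples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = TRIPLE_RE.match(line)
        if match:
            triples.append(match.groups())
    return triples


def feature_vector(triples):
    """Hypothetical features: simple structural counts over the triples."""
    literal_objects = sum(1 for _, _, o in triples if o.startswith('"'))
    return [
        len(triples),                      # number of triples
        len({s for s, _, _ in triples}),   # distinct subjects
        len({p for _, p, _ in triples}),   # distinct predicates
        literal_objects,                   # objects that are literals
        len(triples) - literal_objects,    # objects that are IRIs / blank nodes
    ]


# Illustrative sample data, not taken from the paper's dataset.
SAMPLE = '''
<http://dbpedia.org/resource/Nanjing> <http://www.w3.org/2000/01/rdf-schema#label> "Nanjing"@en .
<http://dbpedia.org/resource/Nanjing> <http://dbpedia.org/ontology/country> <http://dbpedia.org/resource/China> .
<http://dbpedia.org/resource/China> <http://www.w3.org/2000/01/rdf-schema#label> "China"@en .
'''

triples = parse_ntriples(SAMPLE)
vector = feature_vector(triples)
print(vector)  # [3, 2, 2, 2, 1]
```

Vectors of this kind, paired with a fetching-status label per document, could then be passed to any random forest implementation and scored with 10-fold cross-validation, for example scikit-learn's `RandomForestClassifier` together with `cross_val_score(..., cv=10)`. That pairing is one possible setup and is not confirmed by the record.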
topic |
Semantic Web; RDF; machine learning; classification |
url |
https://ieeexplore.ieee.org/document/9032149/ |