A technique for parallel query optimization using MapReduce framework and a semantic-based clustering method
Query optimization is the process of identifying the best Query Execution Plan (QEP). The query optimizer produces a close to optimal QEP for the given queries based on the minimum resource usage. The problem is that for a given query, there are plenty of different equivalent execution plans, each w...
Main Authors: | Elham Azhir, Nima Jafari Navimipour, Mehdi Hosseinzadeh, Arash Sharifi, Aso Darwesh |
---|---|
Format: | Article |
Language: | English |
Published: | PeerJ Inc. 2021-06-01 |
Series: | PeerJ Computer Science |
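Subjects: | Query optimization, Access plan recommendation, Cluster computing, Parallel Processing, MapReduce, DBSCAN Algorithm |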
Online Access: | https://peerj.com/articles/cs-580.pdf |
id | doaj-609e9e47089a44368122d6c1dcd894f5 |
---|---|
record_format | Article |
spelling | doaj-609e9e47089a44368122d6c1dcd894f5; 2021-06-03T15:05:28Z; eng; PeerJ Inc.; PeerJ Computer Science; 2376-5992; 2021-06-01; 7; e580; 10.7717/peerj-cs.580; A technique for parallel query optimization using MapReduce framework and a semantic-based clustering method; Elham Azhir (0), Nima Jafari Navimipour (1), Mehdi Hosseinzadeh (2), Arash Sharifi (3), Aso Darwesh (4); Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran; Future Technology Research Center, National Yunlin University of Science and Technology, Douliou, Yunlin, Taiwan, R.O.C.; Pattern Recognition and Machine Learning Lab, Gachon University, 1342 Seongnamdaero, Sujeonggu, Seongnam, Republic of Korea; Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran; Department of Information Technology, University of Human Development, Sulaymaniyah, Iraq; Query optimization is the process of identifying the best Query Execution Plan (QEP). The query optimizer produces a close to optimal QEP for the given queries based on the minimum resource usage. The problem is that for a given query, there are plenty of different equivalent execution plans, each with a corresponding execution cost. To produce an effective query plan thus requires examining a large number of alternative plans. Access plan recommendation is an alternative technique to database query optimization, which reuses the previously-generated QEPs to execute new queries. In this technique, the query optimizer uses clustering methods to identify groups of similar queries. However, clustering such large datasets is challenging for traditional clustering algorithms due to huge processing time. Numerous cloud-based platforms have been introduced that offer low-cost solutions for the processing of distributed queries such as Hadoop, Hive, Pig, etc. This paper has applied and tested a model for clustering variant sizes of large query datasets parallelly using MapReduce. The results demonstrate the effectiveness of the parallel implementation of query workloads clustering to achieve good scalability.; https://peerj.com/articles/cs-580.pdf; Query optimization; Access plan recommendation; Cluster computing; Parallel Processing; MapReduce; DBSCAN Algorithm |
collection | DOAJ |
language | English |
format | Article |
sources | DOAJ |
author | Elham Azhir, Nima Jafari Navimipour, Mehdi Hosseinzadeh, Arash Sharifi, Aso Darwesh |
spellingShingle | Elham Azhir, Nima Jafari Navimipour, Mehdi Hosseinzadeh, Arash Sharifi, Aso Darwesh; A technique for parallel query optimization using MapReduce framework and a semantic-based clustering method; PeerJ Computer Science; Query optimization; Access plan recommendation; Cluster computing; Parallel Processing; MapReduce; DBSCAN Algorithm |
author_sort | Elham Azhir |
title | A technique for parallel query optimization using MapReduce framework and a semantic-based clustering method |
title_sort | technique for parallel query optimization using mapreduce framework and a semantic-based clustering method |
publisher | PeerJ Inc. |
series | PeerJ Computer Science |
issn | 2376-5992 |
publishDate | 2021-06-01 |
description | Query optimization is the process of identifying the best Query Execution Plan (QEP). The query optimizer produces a close to optimal QEP for the given queries based on the minimum resource usage. The problem is that for a given query, there are plenty of different equivalent execution plans, each with a corresponding execution cost. To produce an effective query plan thus requires examining a large number of alternative plans. Access plan recommendation is an alternative technique to database query optimization, which reuses the previously-generated QEPs to execute new queries. In this technique, the query optimizer uses clustering methods to identify groups of similar queries. However, clustering such large datasets is challenging for traditional clustering algorithms due to huge processing time. Numerous cloud-based platforms have been introduced that offer low-cost solutions for the processing of distributed queries such as Hadoop, Hive, Pig, etc. This paper has applied and tested a model for clustering variant sizes of large query datasets parallelly using MapReduce. The results demonstrate the effectiveness of the parallel implementation of query workloads clustering to achieve good scalability. |
topic | Query optimization; Access plan recommendation; Cluster computing; Parallel Processing; MapReduce; DBSCAN Algorithm |
url | https://peerj.com/articles/cs-580.pdf |
_version_ | 1721399120600498176 |