Query Optimization for Database Federation Systems
Database federation is one approach to data integration in which a middleware component, called a mediator, provides uniform access to a number of heterogeneous data sources. This thesis focuses on query optimization for distributed joins over a database federation. One important observation in query o...
Main Author: | Wang, Di |
---|---|
Other Authors: | Elke A. Rundensteiner, Reader |
Format: | Others |
Published: | Digital WPI, 2009 |
Subjects: | database federation query optimization |
Online Access: | https://digitalcommons.wpi.edu/etd-theses/718 https://digitalcommons.wpi.edu/cgi/viewcontent.cgi?article=1717&context=etd-theses |
id |
ndltd-wpi.edu-oai-digitalcommons.wpi.edu-etd-theses-1717 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-wpi.edu-oai-digitalcommons.wpi.edu-etd-theses-1717 2019-03-22T05:49:40Z Query Optimization for Database Federation Systems Wang, Di
2009-05-04T07:00:00Z text application/pdf https://digitalcommons.wpi.edu/etd-theses/718 https://digitalcommons.wpi.edu/cgi/viewcontent.cgi?article=1717&context=etd-theses Masters Theses (All Theses, All Years) Digital WPI Elke A. Rundensteiner, Reader Murali Mani, Advisor database federation query optimization |
collection |
NDLTD |
format |
Others
|
sources |
NDLTD |
topic |
database federation query optimization |
spellingShingle |
database federation query optimization Wang, Di Query Optimization for Database Federation Systems |
description |
Database federation is one approach to data integration in which a middleware component, called a mediator, provides uniform access to a number of heterogeneous data sources. This thesis focuses on query optimization for distributed joins over a database federation. One important observation in query optimization over distributed database systems is that run-time conditions (namely available buffer size, CPU utilization of the machines, and the network environment) can significantly affect the execution cost of a query plan. However, very few studies of existing database federation systems have addressed run-time conditions. The problem is challenging because the mediator usually cannot know the run-time conditions of remote sites, and taking run-time conditions into account adds complexity to the optimizer. This thesis proposes the Cluster-and-Conquer algorithm for query optimization over a database federation that efficiently takes run-time conditions into account. The algorithm has three benefits. First, the run-time conditions of the machines are available to the cluster mediators. Second, each cluster mediator can process its own sub-query concurrently, which reduces the complexity of processing the query plan. Third, the algorithm outperforms related approaches in terms of "cost of costing", because it removes unnecessary inter-cluster operations at an early stage. I have implemented a prototype data federation system with the Cluster-and-Conquer algorithm. The experimental results demonstrate the capabilities and efficiency of the algorithm and identify the target scenarios in which it performs better than related approaches. |
author2 |
Elke A. Rundensteiner, Reader |
author_facet |
Elke A. Rundensteiner, Reader Wang, Di |
author |
Wang, Di |
author_sort |
Wang, Di |
title |
Query Optimization for Database Federation Systems |
publisher |
Digital WPI |
publishDate |
2009 |
url |
https://digitalcommons.wpi.edu/etd-theses/718 https://digitalcommons.wpi.edu/cgi/viewcontent.cgi?article=1717&context=etd-theses |