Query Optimization for Database Federation Systems

Database federation is one approach to data integration, in which a middleware component, called a mediator, provides uniform access to a number of heterogeneous data sources. In this thesis, we focus on query optimization for distributed joins over a database federation. One important observation in query o...

Bibliographic Details
Main Author:	Wang, Di
Other Authors:	Elke A. Rundensteiner, Reader
Format:	Others
Published:	Digital WPI 2009
Subjects:	database federation query optimization
Online Access:	https://digitalcommons.wpi.edu/etd-theses/718 https://digitalcommons.wpi.edu/cgi/viewcontent.cgi?article=1717&context=etd-theses

id	ndltd-wpi.edu-oai-digitalcommons.wpi.edu-etd-theses-1717
record_format	oai_dc
spelling	ndltd-wpi.edu-oai-digitalcommons.wpi.edu-etd-theses-1717 2019-03-22T05:49:40Z Query Optimization for Database Federation Systems Wang, Di
2009-05-04T07:00:00Z text application/pdf https://digitalcommons.wpi.edu/etd-theses/718 https://digitalcommons.wpi.edu/cgi/viewcontent.cgi?article=1717&context=etd-theses Masters Theses (All Theses, All Years) Digital WPI Elke A. Rundensteiner, Reader Murali Mani, Advisor database federation query optimization
collection	NDLTD
format	Others
sources	NDLTD
topic	database federation query optimization
spellingShingle	database federation query optimization Wang, Di Query Optimization for Database Federation Systems
description	Database federation is one approach to data integration, in which a middleware component, called a mediator, provides uniform access to a number of heterogeneous data sources. In this thesis, we focus on query optimization for distributed joins over a database federation. One important observation in query optimization over distributed database systems is that run-time conditions (namely available buffer size, CPU utilization of the machine, and the network environment) can significantly affect the execution cost of a query plan. However, very few studies of existing database federation systems have addressed run-time conditions. The problem is challenging because the mediator usually cannot know the run-time conditions of remote sites, and considering run-time conditions adds complexity to the optimizer. This thesis proposes the Cluster-and-Conquer algorithm for query optimization over a database federation that efficiently takes run-time conditions into account. The algorithm has three benefits. First, the run-time conditions of machines become available to the cluster mediator. Second, each cluster mediator can process its own sub-query concurrently, which reduces the complexity of processing the query plan. Third, the algorithm outperforms related approaches in terms of "cost of costing", because it removes unnecessary inter-cluster operations at an early stage. I have implemented a prototype data federation system with the Cluster-and-Conquer algorithm. The experimental results show the capabilities and efficiency of the algorithm and identify the target scenarios in which it performs better than related approaches.
author2	Elke A. Rundensteiner, Reader
author_facet	Elke A. Rundensteiner, Reader Wang, Di
author	Wang, Di
author_sort	Wang, Di
title	Query Optimization for Database Federation Systems
title_sort	query optimization for database federation systems
publisher	Digital WPI
publishDate	2009
url	https://digitalcommons.wpi.edu/etd-theses/718 https://digitalcommons.wpi.edu/cgi/viewcontent.cgi?article=1717&context=etd-theses
work_keys_str_mv	AT wangdi queryoptimizationfordatabasefederationsystems
_version_	1719006264743165952

Similar Items