Enhanced Bitmap Indexes for Large Scale Data Management
Main Author: | Canahuate, Guadalupe M. |
---|---|
Language: | English |
Published: | The Ohio State University / OhioLINK, 2009 |
Subjects: | bitmap index scientific data management large scale indexing |
Online Access: | http://rave.ohiolink.edu/etdc/view?acc_num=osu1244047153 |
id |
ndltd-OhioLink-oai-etd.ohiolink.edu-osu1244047153 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-OhioLink-oai-etd.ohiolink.edu-osu1244047153 2021-08-03T05:56:21Z Enhanced Bitmap Indexes for Large Scale Data Management Canahuate, Guadalupe M. bitmap index scientific data management large scale indexing Advances in technology have enabled the production of massive volumes of data through observations and simulations in many application domains. These new data sets and the associated queries pose a new challenge for efficient storage and data retrieval that requires novel indexing structures and algorithms. We propose a series of enhancements to bitmap indexes to make them account for the inherent characteristics of large scale datasets and to efficiently support the type of queries needed to analyze the data. First, we formalize how missing data should be handled and how queries should be executed in the presence of missing data. Then, we propose an adaptive code ordering as a hybrid between Gray code and lexicographic orderings to reorganize the data and further reduce the size of the already compressed bitmaps. We address the inability of the compressed bitmaps to directly access a given row by proposing an approximate encoding that compresses the bitmap in a hash structure. We also extend the existing run-length encoders of bitmap indexes by adding an extra word to represent future rows with zeros and minimize the insertion overhead of new data. We propose a comprehensive framework to execute similarity searches over the bitmap indexes without changes to the current bitmap structure, without accessing the original data, and using a similarity function that is meaningful in high dimensional spaces. Finally, we propose a new encoding and query execution for non-clustered bitmap indexes that combines several attributes in one existence bitmap, reduces the storage requirement, and improves query execution and update time for low cardinality attributes.
2009-09-08 English text The Ohio State University / OhioLINK http://rave.ohiolink.edu/etdc/view?acc_num=osu1244047153 unrestricted This thesis or dissertation is protected by copyright: all rights reserved. It may not be copied or redistributed beyond the terms of applicable copyright laws. |
collection |
NDLTD |
language |
English |
sources |
NDLTD |
topic |
bitmap index scientific data management large scale indexing |
spellingShingle |
bitmap index scientific data management large scale indexing Canahuate, Guadalupe M. Enhanced Bitmap Indexes for Large Scale Data Management |
author |
Canahuate, Guadalupe M. |
title |
Enhanced Bitmap Indexes for Large Scale Data Management |
publisher |
The Ohio State University / OhioLINK |
publishDate |
2009 |
url |
http://rave.ohiolink.edu/etdc/view?acc_num=osu1244047153 |
work_keys_str_mv |
AT canahuateguadalupem enhancedbitmapindexesforlargescaledatamanagement |
_version_ |
1719428122971996160 |
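
The abstract states that the dissertation formalizes how queries over a bitmap index should be executed when some values are missing, but this record does not give the details. The following is only a minimal sketch of the general idea, not the author's method: it assumes an equality-encoded index (one bitmap per distinct value), an extra bitmap marking missing rows, and two hypothetical query semantics (missing rows either never match or always count as possible matches). All function names are illustrative.

```python
from typing import Dict, Iterable, List, Optional


def build_bitmap_index(column: List[Optional[int]]) -> Dict[Optional[int], int]:
    """Build one bitmap per distinct value; the key None holds the missing-row bitmap.

    Each bitmap is a Python int in which bit i is set when row i has that value.
    """
    bitmaps: Dict[Optional[int], int] = {None: 0}
    for row, value in enumerate(column):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << row)
    return bitmaps


def query_in(bitmaps: Dict[Optional[int], int],
             values: Iterable[int],
             missing_matches: bool = False) -> int:
    """OR together the bitmaps of the requested values.

    When missing_matches is True (an assumed semantics), rows with missing
    data are also reported as possible matches.
    """
    result = 0
    for value in values:
        result |= bitmaps.get(value, 0)
    if missing_matches:
        result |= bitmaps[None]
    return result


def rows_of(bitmap: int) -> List[int]:
    """Return the row numbers whose bits are set in the bitmap."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Rows 2 and 5 have no recorded value.
column = [3, 1, None, 3, 2, None, 1]
index = build_bitmap_index(column)

certain = query_in(index, [1, 3])                         # missing rows excluded
possible = query_in(index, [1, 3], missing_matches=True)  # missing rows included

print(rows_of(certain))   # [0, 1, 3, 6]
print(rows_of(possible))  # [0, 1, 2, 3, 5, 6]
```

Keeping the missing rows in their own bitmap means either semantics costs at most one extra bitwise OR at query time, without touching the original data.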