Enhanced Bitmap Indexes for Large Scale Data Management

Bibliographic Details
Main Author: Canahuate, Guadalupe M.
Language: English
Published: The Ohio State University / OhioLINK 2009
Subjects: bitmap index; scientific data management; large scale indexing
Online Access: http://rave.ohiolink.edu/etdc/view?acc_num=osu1244047153
id ndltd-OhioLink-oai-etd.ohiolink.edu-osu1244047153
record_format oai_dc
spelling ndltd-OhioLink-oai-etd.ohiolink.edu-osu1244047153 2021-08-03T05:56:21Z Enhanced Bitmap Indexes for Large Scale Data Management Canahuate, Guadalupe M. bitmap index scientific data management large scale indexing Advances in technology have enabled the production of massive volumes of data through observations and simulations in many application domains. These new data sets and their associated queries pose a new challenge for efficient storage and data retrieval, one that requires novel indexing structures and algorithms. We propose a series of enhancements to bitmap indexes so that they account for the inherent characteristics of large scale datasets and efficiently support the types of queries needed to analyze the data. First, we formalize how missing data should be handled and how queries should be executed in the presence of missing data. Then, we propose an adaptive code ordering, a hybrid between Gray code and lexicographic orderings, to reorganize the data and further reduce the size of the already compressed bitmaps. We address the inability of compressed bitmaps to directly access a given row by proposing an approximate encoding that compresses the bitmap in a hash structure. We also extend the existing run-length encoders of bitmap indexes by adding an extra word that represents future rows with zeros, which minimizes the insertion overhead of new data. We propose a comprehensive framework to execute similarity searches over the bitmap indexes without changes to the current bitmap structure, without accessing the original data, and using a similarity function that is meaningful in high dimensional spaces. Finally, we propose a new encoding and query execution for non-clustered bitmap indexes that combines several attributes in one existence bitmap, reduces the storage requirement, and improves query execution and update time for low cardinality attributes.
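As background for the code-ordering contribution mentioned in the abstract, the following is a minimal sketch of why row order matters for run-length compressed bitmaps. It is not the thesis's adaptive ordering: it only compares the two baseline orderings the abstract names (lexicographic and Gray code) on a toy table. The function names `gray_rank`, `count_runs`, and `total_runs` are hypothetical, and the assumption is that each row is a small integer whose bits are the row's bitmap columns.

```python
def gray_rank(g: int) -> int:
    """Position of codeword g in the reflected binary Gray code sequence."""
    rank = 0
    while g:
        rank ^= g
        g >>= 1
    return rank


def count_runs(bits):
    """Number of maximal runs of identical bits in one bitmap column."""
    return sum(1 for i, b in enumerate(bits) if i == 0 or b != bits[i - 1])


def total_runs(rows, width):
    """Sum of run counts over all bitmap columns (column 0 is the most significant bit)."""
    return sum(
        count_runs([(r >> (width - 1 - c)) & 1 for r in rows])
        for c in range(width)
    )


# Toy table (an assumption for illustration): every 3-bit row appears once.
WIDTH = 3
rows = list(range(2 ** WIDTH))

lex_order = sorted(rows)                   # lexicographic ordering
gray_order = sorted(rows, key=gray_rank)   # Gray code ordering

lex_runs = total_runs(lex_order, WIDTH)
gray_runs = total_runs(gray_order, WIDTH)

print("lexicographic:", lex_order, "runs =", lex_runs)
print("Gray code:    ", gray_order, "runs =", gray_runs)
```

Fewer runs generally means fewer words after run-length encoding, which is the motivation for reordering rows before compression. On real data the gain depends on the value distribution, so this toy comparison should not be read as a result from the thesis.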
2009-09-08 English text The Ohio State University / OhioLINK http://rave.ohiolink.edu/etdc/view?acc_num=osu1244047153 unrestricted This thesis or dissertation is protected by copyright: all rights reserved. It may not be copied or redistributed beyond the terms of applicable copyright laws.
collection NDLTD
language English
sources NDLTD
topic bitmap index
scientific data management
large scale indexing
author Canahuate, Guadalupe M.
title Enhanced Bitmap Indexes for Large Scale Data Management
publisher The Ohio State University / OhioLINK
publishDate 2009
url http://rave.ohiolink.edu/etdc/view?acc_num=osu1244047153