Evaluating partitioning and bucketing strategies for Hive-based Big Data Warehousing systems

Abstract: Hive has long been one of the industry-leading systems for Data Warehousing in Big Data contexts, mainly organizing data into databases, tables, partitions and buckets, stored on top of an unstructured distributed file system like HDFS. Some studies were conducted for understanding the ways...

Bibliographic Details
Main Authors: Eduarda Costa, Carlos Costa, Maribel Yasmina Santos
Format: Article
Language: English
Published: SpringerOpen 2019-05-01
Series: Journal of Big Data
Online Access: http://link.springer.com/article/10.1186/s40537-019-0196-1