A comprehensive performance analysis of Apache Hadoop and Apache Spark for large scale data sets using HiBench

Abstract: Big Data analytics for storing, processing, and analyzing large-scale datasets has become an essential tool for the industry. The advent of distributed computing frameworks such as Hadoop and Spark offers efficient solutions to analyze vast amounts of data. Due to the application programmin...

Bibliographic Details
Main Authors: N. Ahmed, Andre L. C. Barczak, Teo Susnjak, Mohammed A. Rashid
Format: Article
Language: English
Published: SpringerOpen 2020-12-01
Series: Journal of Big Data
Online Access: https://doi.org/10.1186/s40537-020-00388-5