A comprehensive performance analysis of Apache Hadoop and Apache Spark for large scale data sets using HiBench
Abstract Big Data analytics for storing, processing, and analyzing large-scale datasets has become an essential tool for the industry. The advent of distributed computing frameworks such as Hadoop and Spark offers efficient solutions to analyze vast amounts of data. Due to the application programmin...
Main Authors: | , , , |
---|---|
Format: | Article |
Language: | English |
Published: |
SpringerOpen
2020-12-01
|
Series: | Journal of Big Data |
Subjects: | |
Online Access: | https://doi.org/10.1186/s40537-020-00388-5 |