The Optimization Strategies on Clarification of the Misconceptions of Big Data Processing in Dynamic and Opportunistic Environments

Bibliographic Details
Main Authors: Wei Li, Maolin Tang
Format: Article
Language: English
Published: MDPI AG 2021-08-01
Series: Big Data and Cognitive Computing
Subjects:
Online Access: https://www.mdpi.com/2504-2289/5/3/38
Description
Summary: This paper identifies four common misconceptions about the scalability of volunteer computing on big data problems. The misconceptions are clarified by analyzing the relationship between scalability and the factors that affect it: the problem size of the big data, the heterogeneity and dynamics of volunteers, and the overlay structure. The paper proposes optimization strategies to find the optimal overlay for a given big data problem, forming multiple overlays to optimize the performance of the individual steps of the MapReduce paradigm. The goal of the optimization is to achieve the maximum overall performance with the minimum number of volunteers, without overusing resources. The paper demonstrates that simulations on the factors concerned can quickly find the optimization points. It concludes that always welcoming more volunteers overuses the available resources, because additional volunteers do not always benefit overall performance. Finding the optimal use of volunteers is possible for a given big data problem, even when volunteers are dynamic and opportunistic.
ISSN: 2504-2289