Summary: | Interest in scalable data processing solutions based on the
Apache Hadoop ecosystem is growing steadily in the High Energy Physics
(HEP) community. This drives the need for increased reliability and availability
of the central Hadoop service and underlying infrastructure provided to the
community by the CERN IT department. This paper reports on the overall status
of the Hadoop platform and the related Hadoop and Spark service at CERN,
detailing recent enhancements and features introduced in areas including
service configuration, availability, alerting, monitoring and data protection,
to meet the new requirements posed by the user community.
|