A GPU Scheduling Framework to Accelerate Hyper-Parameter Optimization in Deep Learning Clusters

This paper proposes Hermes, a container-based preemptive GPU scheduling framework for accelerating hyper-parameter optimization in deep learning (DL) clusters. Hermes accelerates hyper-parameter optimization by time-sharing between DL jobs and prioritizing jobs with more promising hyper-parameter co...

Bibliographic Details
Main Authors: Jaewon Son, Yonghyuk Yoo, Khu-rai Kim, Youngjae Kim, Kwonyong Lee, Sungyong Park
Format: Article
Published: MDPI AG 2021-02-01
Online Access: https://www.mdpi.com/2079-9292/10/3/350