Failure Prediction using Machine Learning in a Virtualised HPC System and application
Failure is an increasingly important issue in high performance computing and cloud systems. As large-scale systems continue to grow in scale and complexity, mitigating the impact of failure and providing accurate predictions with sufficient lead time remains a challenging research problem...
Main Authors: | |
---|---|
Language: | en |
Published: | 2019 |
Subjects: | |
Online Access: | http://hdl.handle.net/10454/16892 |