A Lightweight and Flexible Tool for Distinguishing Between Hardware Malfunctions and Program Bugs in Debugging Large-Scale Programs

In this paper, we propose a new technique for determining whether a program failure was caused by a hardware malfunction or a program bug. It mitigates the impact of shorter mean time between failures on the debugging process on future exa-scale supercomputers and improves the productivity of de...

Bibliographic Details
Main Authors: Guozhen Zhang, Yi Liu, Hailong Yang, Depei Qian
Format: Article
Language: English
Published: IEEE 2018-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/8540813/