Containment Domains: A Scalable, Efficient and Flexible Resilience Scheme for Exascale Systems

This paper describes and evaluates a scalable and efficient resilience scheme based on the concept of containment domains. Containment domains are a programming construct that enables applications to express resilience needs and to interact with the system to tune and specialize error detection, stat...

Bibliographic Details
Main Authors: Jinsuk Chung, Ikhwan Lee, Michael Sullivan, Jee Ho Ryoo, Dong Wan Kim, Doe Hyun Yoon, Larry Kaplan, Mattan Erez
Format: Article
Language: English
Published: Hindawi Limited 2013-01-01
Series: Scientific Programming
Online Access: http://dx.doi.org/10.3233/SPR-130374