Soft Error Resilience of Deep Residual Networks for Object Recognition

Convolutional Neural Networks (CNNs) have gained wide attention in object recognition, and in object classification in particular. When implemented on Graphics Processing Units (GPUs), deeper networks are more accurate than shallow ones. Residual Networks (ResNets) are one of the deepest CNN arch...


Bibliographic Details
Main Authors: Younis Ibrahim, Haibin Wang, Man Bai, Zhi Liu, Jianan Wang, Zhiming Yang, Zhengming Chen
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Subjects:
Online Access: https://ieeexplore.ieee.org/document/8963961/
id doaj-c1b3af0588d940e292b0ca083f9f23bb
record_format Article
spelling doaj-c1b3af0588d940e292b0ca083f9f23bb
2021-03-30T02:54:03Z
eng
IEEE
IEEE Access, ISSN 2169-3536
2020-01-01, vol. 8, pp. 19490-19503
doi:10.1109/ACCESS.2020.2968129 (IEEE document 8963961)
Soft Error Resilience of Deep Residual Networks for Object Recognition
Younis Ibrahim, College of IoT Engineering, Hohai University–Changzhou, Changzhou, China
Haibin Wang (https://orcid.org/0000-0002-9269-1229), College of IoT Engineering, Hohai University–Changzhou, Changzhou, China
Man Bai, College of IoT Engineering, Hohai University–Changzhou, Changzhou, China
Zhi Liu, College of IoT Engineering, Hohai University–Changzhou, Changzhou, China
Jianan Wang, National Key Laboratory of Analog Integrated Circuits, Chongqing, China
Zhiming Yang, Harbin Institute of Technology, Harbin, China
Zhengming Chen, College of IoT Engineering, Hohai University–Changzhou, Changzhou, China
collection DOAJ
language English
format Article
sources DOAJ
author Younis Ibrahim
Haibin Wang
Man Bai
Zhi Liu
Jianan Wang
Zhiming Yang
Zhengming Chen
title Soft Error Resilience of Deep Residual Networks for Object Recognition
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2020-01-01
description Convolutional Neural Networks (CNNs) have gained wide attention in object recognition, and in object classification in particular. When implemented on Graphics Processing Units (GPUs), deeper networks are more accurate than shallow ones. Residual Networks (ResNets) are among the deepest CNN architectures and are used in various fields, including safety-critical ones. GPUs have proven to be the major accelerator for CNN models. However, modern GPUs are prone to radiation-induced soft errors, which is a serious issue in safety-compliant systems. In this work, we analyze the reliability of ResNet on GPUs and propose an approach to improve it. We first analyze three popular ResNet models, namely ResNet-50, ResNet-101, and ResNet-152, using NVIDIA's fault injector, SASSIFI. We perform an in-depth analysis of the models from the perspective of layer and kernel vulnerability. We then show experimentally how vulnerable the ResNet models are and identify their most vulnerable portions. Finally, we validate our solution, a selective-hardening technique that hardens only the kernels worth hardening in order to avoid unnecessary overhead. Our strategy is demonstrated to mask up to 93.38% of the injected errors with a performance overhead of less than 5.35%. Furthermore, the percentage of errors causing misclassifications can be reduced from 4.2% to 0.104%, thereby significantly improving the model's reliability.
topic Convolutional neural networks
residual networks
safety-critical systems
GPUs
reliability
soft error
url https://ieeexplore.ieee.org/document/8963961/
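
The description above distinguishes injected errors that are masked from those that cause a misclassification. The following is a minimal CPU-side sketch of that distinction, not the authors' method and not SASSIFI (which injects faults into GPU instructions): it flips each bit of one float32 score and checks whether the predicted class changes. The helper names `flip_bit` and `classify` and the three score values are hypothetical, chosen only for illustration.

```python
import struct


def flip_bit(value: float, bit: int) -> float:
    """Return `value`, stored as an IEEE 754 float32, with one bit inverted."""
    if not 0 <= bit < 32:
        raise ValueError("bit must be in the range 0..31")
    # Reinterpret the float32 bytes as an unsigned 32-bit integer.
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    # Invert the chosen bit and reinterpret the result as float32 again.
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return flipped


def classify(scores):
    """Return the index of the largest score (the predicted class)."""
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical class scores; the fault-free ("golden") prediction is class 1.
scores = [0.5, 2.0, 1.0]
golden = classify(scores)

outcomes = {"masked": 0, "misclassified": 0}
for bit in range(32):
    faulty = list(scores)
    faulty[0] = flip_bit(faulty[0], bit)  # single-bit upset in one score
    if classify(faulty) == golden:
        outcomes["masked"] += 1
    else:
        outcomes["misclassified"] += 1

print(outcomes)
```

In this toy case only the flip of the most significant exponent bit (bit 30) changes the prediction, because it turns 0.5 into 2^127; every other flip leaves the score below 2.0 and is masked. A selective-hardening scheme in the spirit of the description would spend protection only where such high-impact faults occur.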