A Thermal-Aware On-Line Fault Tolerance Method for TSV Lifetime Reliability in 3D-NoC Systems
Through-Silicon-Via (TSV) based 3D Integrated Circuits (3D-IC) are among the most advanced architectures, offering low power consumption, shorter wire length, and a smaller footprint. However, 3D-ICs face lifetime reliability problems caused by high operating temperatures and by interconnect failures, espec...
Main Authors: | Khanh N. Dang, Akram Ben Ahmed, Abderazek Ben Abdallah, Xuan-Tu Tran |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE 2020-01-01 |
Series: | IEEE Access |
Subjects: | Fault-tolerance; fault detection; parity check; through silicon via; real-time; thermal aware |
Online Access: | https://ieeexplore.ieee.org/document/9189765/ |
id |
doaj-f3682c1c3eb24acfbcfeac489780c887 |
---|---|
record_format |
Article |
spelling |
doaj-f3682c1c3eb24acfbcfeac489780c887; 2021-03-30T03:27:26Z; eng; IEEE; IEEE Access; ISSN 2169-3536; 2020-01-01; vol. 8, pp. 166642-166657; DOI 10.1109/ACCESS.2020.3022904; IEEE Xplore document 9189765. Title: A Thermal-Aware On-Line Fault Tolerance Method for TSV Lifetime Reliability in 3D-NoC Systems. Authors: Khanh N. Dang (https://orcid.org/0000-0001-6702-3870); Akram Ben Ahmed (https://orcid.org/0000-0002-1253-8620); Abderazek Ben Abdallah (https://orcid.org/0000-0003-3432-0718); Xuan-Tu Tran (https://orcid.org/0000-0003-4259-9579). Affiliations (in listed order): VNU Key Laboratory for Smart Integrated Systems (SISLAB), VNU University of Engineering and Technology (VNU-UET), Vietnam National University, Hanoi (VNU), Hanoi, Vietnam; National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Japan; Adaptive Systems Laboratory, The University of Aizu, Aizu-Wakamatsu, Japan; VNU Key Laboratory for Smart Integrated Systems (SISLAB), VNU University of Engineering and Technology (VNU-UET), Vietnam National University, Hanoi (VNU), Hanoi, Vietnam. Abstract: Through-Silicon-Via (TSV) based 3D Integrated Circuits (3D-IC) are among the most advanced architectures, offering low power consumption, shorter wire length, and a smaller footprint. However, 3D-ICs face lifetime reliability problems caused by high operating temperatures and by interconnect failures, especially in the Through-Silicon-Vias (TSVs), which can significantly affect the accuracy of applications. In this paper, we present an online method, named IaSiG, that supports the detection and correction of lifetime TSV failures. By reusing the conventional recovery method and analyzing the output syndromes, IaSiG can identify and correct the defective TSVs. Results show that within a group, R redundant TSVs can fully localize and correct R defects and support the detection of R + 1 defects. Moreover, by using G groups, it can localize up to G x R and detect up to G x (R + 1) defects. An implementation of IaSiG for 32-bit data with eight groups and two redundancies has a worst-case execution time (WCET) of 5,152 cycles while supporting at most 16 defective TSVs (50% localization). By integrating IaSiG into a 3D Network-on-Chip, we also perform a grid-search based empirical method to insert suitable numbers of redundancies into TSV groups. The empirical method uses the operating temperature as the fault acceleration factor, because temperature is one of the major issues of 3D-ICs. The results show that the proposed method can reduce the number of redundancies compared with the uniform method while still maintaining the required Mean Time to Failure. URL: https://ieeexplore.ieee.org/document/9189765/. Keywords: Fault-tolerance; fault detection; parity check; through silicon via; real-time; thermal aware |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Khanh N. Dang; Akram Ben Ahmed; Abderazek Ben Abdallah; Xuan-Tu Tran |
author_sort |
Khanh N. Dang |
title |
A Thermal-Aware On-Line Fault Tolerance Method for TSV Lifetime Reliability in 3D-NoC Systems |
title_sort |
thermal-aware on-line fault tolerance method for tsv lifetime reliability in 3d-noc systems |
publisher |
IEEE |
series |
IEEE Access |
issn |
2169-3536 |
publishDate |
2020-01-01 |
description |
Through-Silicon-Via (TSV) based 3D Integrated Circuits (3D-IC) are among the most advanced architectures, offering low power consumption, shorter wire length, and a smaller footprint. However, 3D-ICs face lifetime reliability problems caused by high operating temperatures and by interconnect failures, especially in the Through-Silicon-Vias (TSVs), which can significantly affect the accuracy of applications. In this paper, we present an online method, named IaSiG, that supports the detection and correction of lifetime TSV failures. By reusing the conventional recovery method and analyzing the output syndromes, IaSiG can identify and correct the defective TSVs. Results show that within a group, R redundant TSVs can fully localize and correct R defects and support the detection of R + 1 defects. Moreover, by using G groups, it can localize up to G x R and detect up to G x (R + 1) defects. An implementation of IaSiG for 32-bit data with eight groups and two redundancies has a worst-case execution time (WCET) of 5,152 cycles while supporting at most 16 defective TSVs (50% localization). By integrating IaSiG into a 3D Network-on-Chip, we also perform a grid-search based empirical method to insert suitable numbers of redundancies into TSV groups. The empirical method uses the operating temperature as the fault acceleration factor, because temperature is one of the major issues of 3D-ICs. The results show that the proposed method can reduce the number of redundancies compared with the uniform method while still maintaining the required Mean Time to Failure. |
topic |
Fault-tolerance; fault detection; parity check; through silicon via; real-time; thermal aware |
url |
https://ieeexplore.ieee.org/document/9189765/ |
_version_ |
1724183473363615744 |
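
The capacity figures quoted in the description (R redundant TSVs per group localize R defects and detect R + 1; G groups scale both counts by G) can be checked with a short calculation. The sketch below is illustrative only and is not taken from the paper: the function name `tsv_fault_capacity` is hypothetical, and reading "50% localization" as 16 localizable defects out of 32 data bits is an assumption.

```python
def tsv_fault_capacity(groups: int, redundant_per_group: int) -> dict:
    """Return the defect counts stated in the description.

    Hypothetical helper: it only evaluates the two formulas quoted in the
    abstract, G x R (localizable) and G x (R + 1) (detectable).
    """
    if groups < 1 or redundant_per_group < 0:
        raise ValueError("groups must be >= 1 and redundant_per_group >= 0")
    return {
        "localizable": groups * redundant_per_group,
        "detectable": groups * (redundant_per_group + 1),
    }


# Configuration quoted in the description: 32-bit data, eight groups,
# two redundancies (taken here to mean two per group).
data_bits = 32
capacity = tsv_fault_capacity(groups=8, redundant_per_group=2)

# Assumption: the quoted "50% localization" is localizable defects / data bits.
localization_ratio = capacity["localizable"] / data_bits

print(capacity, localization_ratio)
```

With eight groups and two redundancies per group, the formulas give 16 localizable defects, which matches the "at most 16 defective TSVs" figure in the description.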