A High-accuracy and Cost-effective SFP SUMMA Array Processor for CNN Inference Application


Bibliographic Details
Main Authors: Li, Chi-Jiun, 李其駿
Other Authors: Liu, Chih-Wei
Format: Others
Language: en_US
Published: 2017
Online Access: http://ndltd.ncl.edu.tw/handle/5g73p2
Description
Summary: Master's === National Chiao Tung University === Institute of Electronics === 106 === We propose a high-accuracy and cost-effective array processor for Deep Convolutional Neural Network (DCNN) inference. The proposed Static Floating-Point (SFP) arithmetic lets the MAC operations operate only on the non-zero bits of the data, which preserves both the energy efficiency and the accuracy of the computing engine. Moreover, by applying the scalable universal matrix multiplication algorithm (SUMMA), we avoid storing repeated data in local storage, and data can be broadcast to the corresponding PEs. With the proposed simple stream interface unit (SIU), the design greatly reduces how often operands (data or weights) are read from or written to the central register file (CRF), and so minimizes power consumption. Simulation results show that the proposed SFP SUMMA array processor achieves approximately 56.47% top-1 accuracy while consuming only 167 mW. Synthesized in TSMC 90 nm CMOS technology, the proposed SFP SUMMA DIP achieves 0.45 TOPS/W. By comparison, on the same workload of the 5 convolutional layers of AlexNet, MIT Eyeriss achieves only 0.3 TOPS/W (@65 nm CMOS).
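The SUMMA data flow mentioned in the summary can be illustrated with a small software sketch. This is the general textbook form of SUMMA (a matrix product built up as a sum of outer products, with one column of A and one row of B broadcast per step), not the thesis's hardware mapping. The function name `summa_matmul`, the grid parameters, and the block partitioning are assumptions made for illustration only.

```python
def summa_matmul(A, B, grid_rows, grid_cols):
    """Compute C = A x B on a simulated grid_rows x grid_cols grid of PEs.

    Illustrative sketch of generic SUMMA; names and partitioning are
    hypothetical and not taken from the thesis.
    """
    m, k, n = len(A), len(A[0]), len(B[0])
    if len(B) != k:
        raise ValueError("inner dimensions of A and B must match")

    def split(total, parts):
        # Split range(total) into `parts` contiguous, near-equal blocks.
        base, extra = divmod(total, parts)
        bounds, start = [], 0
        for p in range(parts):
            size = base + (1 if p < extra else 0)
            bounds.append((start, start + size))
            start += size
        return bounds

    row_blocks = split(m, grid_rows)   # C rows owned by each PE row
    col_blocks = split(n, grid_cols)   # C columns owned by each PE column
    C = [[0] * n for _ in range(m)]

    for step in range(k):
        a_col = [A[r][step] for r in range(m)]  # column `step` of A
        b_row = B[step]                          # row `step` of B
        for r0, r1 in row_blocks:
            a_seg = a_col[r0:r1]       # broadcast to every PE in this grid row
            for c0, c1 in col_blocks:
                b_seg = b_row[c0:c1]   # broadcast to every PE in this grid column
                # PE at this grid position accumulates a rank-1 update
                # into its own block of C.
                for i, a in enumerate(a_seg):
                    for j, b in enumerate(b_seg):
                        C[r0 + i][c0 + j] += a * b
    return C
```

In this sketch each PE holds only its own block of C plus the operand segments broadcast to it in the current step, so no PE keeps a private copy of a full row or column of A or B. That is the property the summary refers to when it says repeated data need not be stored locally.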