Performance Optimization of Accelerators using C-based High-Level Synthesis Flow

Bibliographic Details
Main Authors: Tsai, Hsin Tien, 蔡欣恬
Other Authors: Huang, Chih Tsun
Format: Others
Language: en_US
Published: 2016
Online Access: http://ndltd.ncl.edu.tw/handle/10187876507311394693
Description
Summary: Master's === National Tsing Hua University === Department of Computer Science === 104 === High-level synthesis (HLS) has made significant progress in compiling high-level programs into register-transfer level (RTL) specifications. Memory partitioning in HLS can efficiently map data elements of the same logical array onto multiple physical banks, but manual code rewriting is still necessary to obtain better quality of results in memory system optimization. In this thesis, we present a memory-remapping methodology that optimizes memory partitioning and performance. Our flow uses Aladdin to explore the design space quickly and to generate dynamic data dependence graphs (DDDGs). We annotate these graphs with memory accesses, memory partitions, and scheduled cycles to show the state of memory accesses after scheduling, and then move candidates on the graphs to reduce the total cycle count and to produce a better data placement across memory partitions. We then optimize the code according to this improved data placement. The Vivado HLS tool can generate different architectures for a single design under different user configurations. Because Vivado HLS implements these user-applied configurations in a general way, it may limit the behavior of the generated RTL design. We propose three patches that resolve these limitations. Experimental results on Vivado HLS show that our approach reduces the total cycle count of a design, and results from Design Compiler show that, after synthesis to gate level, the optimized design generated by our methodology also achieves better performance.
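
To make the idea of mapping one logical array onto several physical banks concrete, here is a minimal C sketch of two common bank-mapping schemes, cyclic and block, which correspond to the kind of options the Vivado HLS array partitioning directive offers. It is not the thesis's remapping algorithm. All names (`cyclic_bank`, `block_bank`, `cycles_needed`, `MAX_BANKS`) are hypothetical, and the cost model is an assumption: every bank serves a fixed number of accesses per cycle, and accesses that collide on a bank are serialized.

```c
#include <stddef.h>

/* Illustrative only: names and cost model are assumptions, not the thesis's. */
#define MAX_BANKS 64

typedef enum { PART_CYCLIC, PART_BLOCK } part_mode;

/* Cyclic partitioning: consecutive elements go to consecutive banks. */
static size_t cyclic_bank(size_t index, size_t num_banks)
{
    return index % num_banks;
}

/* Block partitioning: the array is cut into num_banks contiguous chunks. */
static size_t block_bank(size_t index, size_t array_len, size_t num_banks)
{
    size_t chunk = (array_len + num_banks - 1) / num_banks; /* ceiling division */
    return index / chunk;
}

/*
 * Cycles needed to serve n accesses that were scheduled together, assuming
 * each bank can serve ports_per_bank accesses per cycle. The busiest bank
 * determines the result. Returns 0 for invalid arguments.
 */
static size_t cycles_needed(const size_t *indices, size_t n,
                            size_t array_len, size_t num_banks,
                            size_t ports_per_bank, part_mode mode)
{
    size_t load[MAX_BANKS] = {0};
    size_t worst = 0;

    if (num_banks == 0 || num_banks > MAX_BANKS || ports_per_bank == 0)
        return 0;

    for (size_t i = 0; i < n; i++) {
        if (indices[i] >= array_len)
            return 0; /* out-of-range access */
        size_t bank = (mode == PART_CYCLIC)
                          ? cyclic_bank(indices[i], num_banks)
                          : block_bank(indices[i], array_len, num_banks);
        load[bank]++;
        if (load[bank] > worst)
            worst = load[bank];
    }
    return (worst + ports_per_bank - 1) / ports_per_bank;
}
```

Under these assumptions, a stride-4 access pattern (indices 0, 4, 8, 12) on a 16-element array with four single-port banks lands entirely in bank 0 under cyclic partitioning and is serialized, whereas block partitioning spreads the same accesses over all four banks. This dependence of the cycle count on data placement is what the remapping methodology described in the summary targets.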