Compilers for VLIW DSP Architectures with Distributed and Irregular Designs

Doctoral === National Tsing Hua University === Department of Computer Science === 95 === VLIW architectures have become the mainstream design for modern high-end processors in recent years, offering more instruction-level parallelism (ILP) and higher potential performance than traditional single-issue CISC/RISC machines. Due to the advances in V...


Bibliographic Details
Main Authors: Yung-Chia Lin, 林永嘉
Other Authors: Jenq Kuen Lee
Format: Others
Language: en_US
Published: 2007
Online Access: http://ndltd.ncl.edu.tw/handle/57626100213633979560
id ndltd-TW-095NTHU5392032
record_format oai_dc
collection NDLTD
language en_US
format Others
sources NDLTD
description Doctoral === National Tsing Hua University === Department of Computer Science === 95 === VLIW architectures have become the mainstream design for modern high-end processors in recent years, offering more instruction-level parallelism (ILP) and higher potential performance than traditional single-issue CISC/RISC machines. Advances in VLSI technology make it possible to build more powerful and faster chips than ever, but they also raise additional issues for the design of a new VLIW processor: complexity, die size, and power dissipation. In the embedded-system market, a successful processor design must provide ample performance while also featuring low power consumption, low cost, and reduced time-to-market. Consequently, some popular and sophisticated techniques for enhancing the performance of a general-purpose VLIW processor may not be feasible for an embedded processor, even one that also demands high performance. In contrast with the traditional architectures of high-performance processors, the wide variety of register file architectures and irregular designs developed for embedded processors in recent years aims at reducing complexity, power dissipation, and die size. There has been considerable interest in techniques that effectively support code generation and optimization for such irregular architectures, because the compiler is generally regarded as the most important system-software component in making a processor design succeed. Adequate compiler support for VLIW architectures is also essential for dramatically improving programming efficiency.
This dissertation contributes to the design and development of an effective compiler for a novel VLIW DSP with irregular designs. The target DSP architecture, known as the PAC DSP core, is designed with distinctively partitioned register files in which port access is highly restricted. Moreover, the PAC DSP uses a heterogeneous distributed data-path architecture to attain an efficient design with low complexity, small size, and potentially low power consumption. We believe that the PAC DSP employs a promising architecture model that pragmatically supports the high parallelism demanded by DSP applications while curbing the growth of chip complexity, die size, and power dissipation. Our experience in designing compiler support for the PAC DSP may also be of interest to those developing compilers for similar architectures with such irregular designs. Our major contributions in this dissertation are as follows:
1. We present our application of the Open64/ORC infrastructure to a novel VLIW DSP and the specific design for handling its register file architecture. As part of an effort to overcome the new challenges of code generation for the PAC DSP, we have developed a new register allocation framework and other retargeting optimization phases that support the effective generation of high-quality code.
2. We propose a novel heuristic algorithm, named ping-pong aware local favorable (PALF) register allocation, to obtain advantageous register allocations that are expected to better utilize irregular register file architectures. We also propose an alternative register allocation scheme using a simulated-annealing (SA) approach, and a hybrid optimization procedure that integrates PALF and SA. Furthermore, an associated global register allocation strategy is presented and discussed.
3. Advanced topics in generating optimized code for PAC DSP architectures are also discussed in this dissertation and preliminarily developed in our compilation infrastructure.
The results of all experiments performed using our optimizing compiler, based on the Open Research Compiler (Open64/ORC), showed significant performance improvement over the primitive code generation. Our preliminary experimental results also indicate that our compiler can efficiently utilize the features of the specific register file architectures and irregular designs in the PAC DSP.
author2 Jenq Kuen Lee
author_facet Jenq Kuen Lee
Yung-Chia Lin
林永嘉
author Yung-Chia Lin
林永嘉
title Compilers for VLIW DSP Architectures with Distributed and Irregular Designs
publishDate 2007
url http://ndltd.ncl.edu.tw/handle/57626100213633979560
spelling ndltd-TW-095NTHU5392032 2015-10-13T16:51:13Z http://ndltd.ncl.edu.tw/handle/57626100213633979560 Compilers for VLIW DSP Architectures with Distributed and Irregular Designs 具分散式及非正規設計之超長指令集數位訊號處理器架構之編譯器設計與最佳化研究 Yung-Chia Lin 林永嘉 Doctoral National Tsing Hua University Department of Computer Science 95 Jenq Kuen Lee 李政崑 2007 學位論文 ; thesis 92 en_US