A Basis Theory for Loop Parallelization

Bibliographic Details
Main Authors: Liu, Li (劉立)
Other Authors: Lin, Ferng-Ching
Format: Others
Language: zh-TW
Published: 1995
Online Access: http://ndltd.ncl.edu.tw/handle/71089128535342433703
Description
Summary: Ph.D. === National Taiwan University === Graduate Institute of Computer Science and Information Engineering === 84 === Parallelism extraction, iteration partitioning, data partitioning, and scheduling are the most important issues in parallelizing compilers. Parallelism extraction finds the computations that can run in parallel. Iteration partitioning seeks to maximize the number of independent partitions. Data partitioning groups the data used by dependent iterations in order to reduce communication. Loop scheduling synchronizes the dependent iterations. Because all of these tasks rely on dependence vectors, manipulating those vectors is the key problem in designing a parallelizing compiler. We construct a proper basis, called the pc-basis, for the space spanned by the dependence vectors. The loop transformations needed for each purpose can then be derived systematically from the pc-basis. We use the pc-basis to construct unimodular transformations that extract maximum outer-loop and inner-loop parallelism by systematically combining loop reversal, loop skewing, and loop interchange. We also use the pc-basis to construct left-side and right-side unimodular transformations that find a maximal independent partitioning. To treat iteration partitioning and data partitioning together, we transform a loop program into three layers of loops that suit different communication and synchronization needs. For loop synchronization, the pc-basis is used to construct an optimal set of synchronization vectors. All of the above methods can be automated. We have implemented our outer-loop parallelization and three-layer loop parallelization in Parafrase-2.
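
As background for the unimodular transformations mentioned in the summary, the following is a minimal sketch of how loop skewing and loop interchange combine into one integer matrix that is applied to dependence vectors. It does not reproduce the thesis's pc-basis construction, which the record does not describe. The example loop nest (a two-dimensional stencil with dependence vectors (1, 0) and (0, 1)), the matrices, and all function names are assumptions chosen for illustration.

```python
# Illustrative sketch only: a textbook wavefront transformation,
# not the pc-basis method of the thesis.

def mat_mul(a, b):
    """Multiply two integer matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def mat_vec(t, d):
    """Apply matrix t to dependence vector d."""
    return tuple(sum(row[j] * d[j] for j in range(len(d))) for row in t)

def det2(t):
    """Determinant of a 2x2 matrix."""
    return t[0][0] * t[1][1] - t[0][1] * t[1][0]

def lex_positive(d):
    """True if the first nonzero component of d is positive."""
    for x in d:
        if x != 0:
            return x > 0
    return False

# Assumed example: for i: for j: a[i][j] = a[i-1][j] + a[i][j-1]
deps = [(1, 0), (0, 1)]

SKEW = [[1, 0], [1, 1]]         # (i, j) -> (i, i + j)
INTERCHANGE = [[0, 1], [1, 0]]  # swap the two loops

# Skew first, then interchange: T = INTERCHANGE * SKEW
T = mat_mul(INTERCHANGE, SKEW)
new_deps = [mat_vec(T, d) for d in deps]

# Unimodular: integer entries with determinant +1 or -1.
is_unimodular = abs(det2(T)) == 1
# Legal: every transformed dependence stays lexicographically positive.
is_legal = all(lex_positive(d) for d in new_deps)
# The inner loop is parallel when the outer loop carries every dependence.
inner_parallel_before = all(d[0] > 0 for d in deps)
inner_parallel_after = all(d[0] > 0 for d in new_deps)

print(T, new_deps, is_unimodular, is_legal,
      inner_parallel_before, inner_parallel_after)
```

In this assumed example the original nest has no parallel loop, while after the combined transformation every dependence is carried by the new outer loop, so the new inner loop can run in parallel.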