Auto-Tuning for GPGPU Applications Using Performance and Energy Model


Bibliographic Details
Main Authors: Shih-Meng Teng, 鄧世孟
Other Authors: Pao-Ann Hsiung
Format: Others
Language: en_US
Published: 2013
Online Access: http://ndltd.ncl.edu.tw/handle/44430574859489904355
id ndltd-TW-100CCU00392094
record_format oai_dc
spelling ndltd-TW-100CCU00392094 2015-10-13T22:23:53Z http://ndltd.ncl.edu.tw/handle/44430574859489904355 Auto-Tuning for GPGPU Applications Using Performance and Energy Model 針對GPGPU應用使用效能與能源模型之自動化調校參數之最佳化方法 Shih-Meng Teng 鄧世孟 Master's thesis === National Chung Cheng University === Institute of Computer Science and Information Engineering === Academic year 101 === The graphics processing unit (GPU) is a popular accelerator for image processing systems because the algorithms are inherently and massively parallel and the workloads are divisible. The GPU was further generalized into the General-Purpose GPU (GPGPU), which can accelerate general applications, such as scientific computations, through parallelization. Several systems on the Top 500 list of supercomputers and the Green 500 list of power-efficient computers include millions of GPGPU devices. Distributing workload among the large number of cores in a GPGPU is currently still a mostly manual trial-and-error process: programmers manually try out some configurations (distributions of workload) and might settle for a sub-optimal one, leading to poor performance and/or high power consumption. The state-of-the-art methods for addressing this issue are mainly based on profiling of computation kernels. The work presented in this thesis has two benefits. First, it proposes a model-based analytic approach for estimating the performance and power consumption of kernels such that the estimated values have a high average fidelity. Second, an auto-tuning framework is proposed for automatically obtaining a near-optimal configuration for a kernel computation such that either of the following two optimizations can be performed: (a) the kernel's execution time is almost minimal while satisfying a user-given upper bound on energy consumption, or (b) the kernel's energy consumption is almost minimal while satisfying a user-given upper bound on execution time. The proposed framework formulates the problem as a constrained optimization and solves it using either simulated annealing (SA) or a genetic algorithm (GA).
Experimental results show that the fidelities of the proposed models for performance and energy consumption are 0.86 and 0.89, respectively. Further, the optimization algorithms yield normalized optimality offsets of 0.94% and 0.79% for SA and GA, respectively. Pao-Ann Hsiung 熊博安 2013 Thesis, 87 pages. en_US
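The abstract describes auto-tuning as a constrained optimization: pick a kernel launch configuration that minimizes model-predicted execution time subject to an energy bound (or vice versa), solved with SA or GA. The thesis's actual models and framework are not reproduced here; the sketch below is only an illustration of that SA formulation, using hypothetical `predicted_time`/`predicted_energy` placeholders standing in for the performance and energy models, and a one-dimensional search over CUDA-style threads-per-block values.

```python
import math
import random

CONFIGS = [32, 64, 128, 256, 512, 1024]  # candidate threads-per-block values

def predicted_time(block_size):
    # Placeholder analytic model (not from the thesis): fastest near 256 threads.
    return 1.0 + abs(math.log2(block_size) - 8) * 0.3

def predicted_energy(block_size):
    # Placeholder analytic model (not from the thesis): energy grows with block size.
    return 0.5 + block_size / 512

def tune(energy_bound, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Case (a) of the abstract: minimize predicted execution time
    subject to a user-given upper bound on predicted energy."""
    rng = random.Random(seed)
    feasible = [c for c in CONFIGS if predicted_energy(c) <= energy_bound]
    if not feasible:
        return None  # no configuration satisfies the energy bound
    cur = rng.choice(feasible)
    best = cur
    temp = t0
    for _ in range(steps):
        cand = rng.choice(feasible)
        delta = predicted_time(cand) - predicted_time(cur)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cur = cand
            if predicted_time(cur) < predicted_time(best):
                best = cur
        temp *= cooling  # geometric cooling schedule
    return best
```

Case (b) is symmetric: swap the roles of the time and energy models. A GA variant would instead evolve a population of feasible configurations under the same objective.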
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's thesis === National Chung Cheng University === Institute of Computer Science and Information Engineering === Academic year 101 === The graphics processing unit (GPU) is a popular accelerator for image processing systems because the algorithms are inherently and massively parallel and the workloads are divisible. The GPU was further generalized into the General-Purpose GPU (GPGPU), which can accelerate general applications, such as scientific computations, through parallelization. Several systems on the Top 500 list of supercomputers and the Green 500 list of power-efficient computers include millions of GPGPU devices. Distributing workload among the large number of cores in a GPGPU is currently still a mostly manual trial-and-error process: programmers manually try out some configurations (distributions of workload) and might settle for a sub-optimal one, leading to poor performance and/or high power consumption. The state-of-the-art methods for addressing this issue are mainly based on profiling of computation kernels. The work presented in this thesis has two benefits. First, it proposes a model-based analytic approach for estimating the performance and power consumption of kernels such that the estimated values have a high average fidelity. Second, an auto-tuning framework is proposed for automatically obtaining a near-optimal configuration for a kernel computation such that either of the following two optimizations can be performed: (a) the kernel's execution time is almost minimal while satisfying a user-given upper bound on energy consumption, or (b) the kernel's energy consumption is almost minimal while satisfying a user-given upper bound on execution time. The proposed framework formulates the problem as a constrained optimization and solves it using either simulated annealing (SA) or a genetic algorithm (GA). Experimental results show that the fidelities of the proposed models for performance and energy consumption are 0.86 and 0.89, respectively. Further, the optimization algorithms yield normalized optimality offsets of 0.94% and 0.79% for SA and GA, respectively.
author2 Pao-Ann Hsiung
author_facet Pao-Ann Hsiung
Shih-Meng Teng
鄧世孟
author Shih-Meng Teng
鄧世孟
spellingShingle Shih-Meng Teng
鄧世孟
Auto-Tuning for GPGPU Applications Using Performance and Energy Model
author_sort Shih-Meng Teng
title Auto-Tuning for GPGPU Applications Using Performance and Energy Model
title_short Auto-Tuning for GPGPU Applications Using Performance and Energy Model
title_full Auto-Tuning for GPGPU Applications Using Performance and Energy Model
title_fullStr Auto-Tuning for GPGPU Applications Using Performance and Energy Model
title_full_unstemmed Auto-Tuning for GPGPU Applications Using Performance and Energy Model
title_sort auto-tuning for gpgpu applications using performance and energy model
publishDate 2013
url http://ndltd.ncl.edu.tw/handle/44430574859489904355
work_keys_str_mv AT shihmengteng autotuningforgpgpuapplicationsusingperformanceandenergymodel
AT dèngshìmèng autotuningforgpgpuapplicationsusingperformanceandenergymodel
AT shihmengteng zhēnduìgpgpuyīngyòngshǐyòngxiàonéngyǔnéngyuánmóxíngzhīzìdònghuàdiàoxiàocānshùzhīzuìjiāhuàfāngfǎ
AT dèngshìmèng zhēnduìgpgpuyīngyòngshǐyòngxiàonéngyǔnéngyuánmóxíngzhīzìdònghuàdiàoxiàocānshùzhīzuìjiāhuàfāngfǎ
_version_ 1718075378678890496