Optimization of Workgroup Scheduling on CASLAB-GPUSIM
Master's thesis === National Cheng Kung University === Institute of Computer and Communication Engineering === 105 === General-purpose graphics processing units (GPGPUs) have become increasingly important in recent years. We develop CASLAB-GPUSIM, a GPGPU simulation platform based on the single-instruction, multiple-thread (SIMT) architecture and written in SystemC. The platform also includes the memor...
Main Authors: | Sen-Chih Tsai, 蔡森至 |
---|---|
Other Authors: | Chung-Ho Chen |
Format: | Others |
Language: | zh-TW |
Published: | 2017 |
Online Access: | http://ndltd.ncl.edu.tw/handle/46t7ed |
id |
ndltd-TW-105NCKU5652054 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-105NCKU5652054 2019-05-15T23:47:01Z http://ndltd.ncl.edu.tw/handle/46t7ed Optimization of Workgroup Scheduling on CASLAB-GPUSIM 繪圖處理器之執行緒區塊排程優化與其在CASLAB-GPUSIM上之實現 Sen-Chih Tsai 蔡森至 Master's thesis National Cheng Kung University Institute of Computer and Communication Engineering 105 General-purpose graphics processing units (GPGPUs) have become increasingly important in recent years. We develop CASLAB-GPUSIM, a GPGPU simulation platform based on the single-instruction, multiple-thread (SIMT) architecture and written in SystemC. The platform also includes the memory subsystem and the software toolchain, and is verified with benchmarks from Rodinia, AMD, and NVIDIA. This thesis explores performance problems in workgroup scheduling and warp scheduling on CASLAB-GPUSIM and proposes two methods. The first is KWS, a kernel-aware warp scheduler, which must be used with mixed concurrent kernel execution. KWS prioritizes warps by kernel attributes and instruction type to ease the imbalance between kernel workload and hardware resources. The second is PBWS, a profiling-based workgroup scheduler, which restricts the maximum number of workgroups allocated to the streaming multiprocessors. PBWS mitigates the imbalance between the memory requests issued by a kernel and the memory subsystem. Both mechanisms are implemented in CASLAB-GPUSIM and evaluated with the benchmarks. KWS with mixed concurrent kernel execution yields a 20% speedup over traditional concurrent kernel execution with a Loose Round-Robin warp scheduler. PBWS yields an 11% speedup over a Round-Robin workgroup scheduler. Chung-Ho Chen 陳中和 2017 degree thesis ; thesis 60 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others |
sources |
NDLTD |
description |
Master's thesis === National Cheng Kung University === Institute of Computer and Communication Engineering === 105 === General-purpose graphics processing units (GPGPUs) have become increasingly important in recent years. We develop CASLAB-GPUSIM, a GPGPU simulation platform based on the single-instruction, multiple-thread (SIMT) architecture and written in SystemC. The platform also includes the memory subsystem and the software toolchain, and is verified with benchmarks from Rodinia, AMD, and NVIDIA.
This thesis explores performance problems in workgroup scheduling and warp scheduling on CASLAB-GPUSIM and proposes two methods. The first is KWS, a kernel-aware warp scheduler, which must be used with mixed concurrent kernel execution. KWS prioritizes warps by kernel attributes and instruction type to ease the imbalance between kernel workload and hardware resources. The second is PBWS, a profiling-based workgroup scheduler, which restricts the maximum number of workgroups allocated to the streaming multiprocessors. PBWS mitigates the imbalance between the memory requests issued by a kernel and the memory subsystem. Both mechanisms are implemented in CASLAB-GPUSIM and evaluated with the benchmarks. KWS with mixed concurrent kernel execution yields a 20% speedup over traditional concurrent kernel execution with a Loose Round-Robin warp scheduler. PBWS yields an 11% speedup over a Round-Robin workgroup scheduler.
|
author2 |
Chung-Ho Chen |
author_facet |
Chung-Ho Chen Sen-Chih Tsai 蔡森至 |
author |
Sen-Chih Tsai 蔡森至 |
spellingShingle |
Sen-Chih Tsai 蔡森至 Optimization of Workgroup Scheduling on CASLAB-GPUSIM |
author_sort |
Sen-Chih Tsai |
title |
Optimization of Workgroup Scheduling on CASLAB-GPUSIM |
title_short |
Optimization of Workgroup Scheduling on CASLAB-GPUSIM |
title_full |
Optimization of Workgroup Scheduling on CASLAB-GPUSIM |
title_fullStr |
Optimization of Workgroup Scheduling on CASLAB-GPUSIM |
title_full_unstemmed |
Optimization of Workgroup Scheduling on CASLAB-GPUSIM |
title_sort |
optimization of workgroup scheduling on caslab-gpusim |
publishDate |
2017 |
url |
http://ndltd.ncl.edu.tw/handle/46t7ed |
work_keys_str_mv |
AT senchihtsai optimizationofworkgroupschedulingoncaslabgpusim AT càisēnzhì optimizationofworkgroupschedulingoncaslabgpusim AT senchihtsai huìtúchùlǐqìzhīzhíxíngxùqūkuàipáichéngyōuhuàyǔqízàicaslabgpusimshàngzhīshíxiàn AT càisēnzhì huìtúchùlǐqìzhīzhíxíngxùqūkuàipáichéngyōuhuàyǔqízàicaslabgpusimshàngzhīshíxiàn |
_version_ |
1719154985100378112 |