Computation and Communication Aware Task Graph Scheduling on Multi-GPGPU Systems

Master's === National Chiao Tung University === Department of Electronics Engineering, Institute of Electronics === 102 === Owing to their massively parallel computation capability, GPGPUs have emerged as popular throughput computing platforms, and there is growing interest in exploiting systems with multiple GPGPUs. However, attaining supe...

Bibliographic Details
Main Authors: Wang, Yun-Ting, 王允廷
Other Authors: Lai, Bo-Cheng
Format: Others
Language: en_US
Published: 2014
Online Access: http://ndltd.ncl.edu.tw/handle/71737217737762883446
id ndltd-TW-102NCTU5428163
record_format oai_dc
spelling ndltd-TW-102NCTU5428163 2015-10-14T00:18:37Z http://ndltd.ncl.edu.tw/handle/71737217737762883446 Computation and Communication Aware Task Graph Scheduling on Multi-GPGPU Systems 在多圖形處理器系統上考量運算負載與資料傳遞之工作排程方法 Wang, Yun-Ting 王允廷 Master's National Chiao Tung University Department of Electronics Engineering, Institute of Electronics 102 Lai, Bo-Cheng 賴伯承 2014 thesis 56 en_US
collection NDLTD
language en_US
format Others
sources NDLTD
description Master's === National Chiao Tung University === Department of Electronics Engineering, Institute of Electronics === 102 === Owing to their massively parallel computation capability, GPGPUs have emerged as popular throughput computing platforms, and there is growing interest in exploiting systems with multiple GPGPUs. However, attaining superior performance in a multi-GPGPU system involves three main design challenges. The first is to balance the load of tasks assigned to each GPGPU; an imbalanced load across the system can leave some GPGPUs idle and degrade overall performance. The second is to exploit memory resources by fully leveraging data reuse between threads as well as between kernels; poor data reuse causes excessive data accesses and transfers. The third is how efficiently a program can hide data transfer overhead by overlapping computation and communication [1]. This thesis addresses these design issues by proposing Computation and Communication Aware task graph Scheduling (CCAS) for multi-GPGPU systems. CCAS adopts a heuristic algorithm that considers the effect of both data reuse and load balance on the performance of multi-GPGPU systems. For multi-graph applications, a pre-scan method is applied to cluster disjoint task graphs onto the GPGPUs based on the characteristics of the graphs. In summary, CCAS achieves an average performance improvement of 22.15% over a previous work. In multi-graph applications, the pre-scan clustering method achieves good performance scaling when the system size is increased from 2 to 4 GPGPUs.
author2 Lai, Bo-Cheng
author Wang, Yun-Ting
王允廷
title Computation and Communication Aware Task Graph Scheduling on Multi-GPGPU Systems
publishDate 2014
url http://ndltd.ncl.edu.tw/handle/71737217737762883446