Computation and Communication Aware Task Graph Scheduling on Multi-GPGPU Systems

Master's === National Chiao Tung University === Department of Electronics Engineering and Institute of Electronics === 102 === Owing to their massively parallel computation capability, GPGPUs have emerged as popular throughput computing platforms, and there is growing interest in exploiting systems with multiple GPGPUs. However, attaining supe...


Bibliographic Details
Main Authors:	Wang, Yun-Ting, 王允廷
Other Authors:	Lai, Bo-Cheng
Format:	Others
Language:	en_US
Published:	2014
Online Access:	http://ndltd.ncl.edu.tw/handle/71737217737762883446

id	ndltd-TW-102NCTU5428163
record_format	oai_dc
spelling	ndltd-TW-102NCTU5428163 2015-10-14T00:18:37Z http://ndltd.ncl.edu.tw/handle/71737217737762883446 Computation and Communication Aware Task Graph Scheduling on Multi-GPGPU Systems 在多圖形處理器系統上考量運算負載與資料傳遞之工作排程方法 Wang, Yun-Ting 王允廷 Master's, National Chiao Tung University, Department of Electronics Engineering and Institute of Electronics, 102 Lai, Bo-Cheng 賴伯承 2014 學位論文 ; thesis 56 en_US
collection	NDLTD
language	en_US
format	Others
sources	NDLTD
description	Master's === National Chiao Tung University === Department of Electronics Engineering and Institute of Electronics === 102 === Owing to their massively parallel computation capability, GPGPUs have emerged as popular throughput computing platforms, and there is growing interest in exploiting systems with multiple GPGPUs. However, attaining superior performance in a multi-GPGPU system involves three main design challenges. The first is to balance the load of tasks assigned to each GPGPU; an imbalanced load across the system can leave some GPGPUs idle and degrade overall performance. The second is to exploit memory resources by fully leveraging data reuse between threads as well as between kernels; poor data reuse causes excessive data accesses and transfers. The third is how efficiently a program can hide data transfer overhead by overlapping computation and communication [1]. This thesis addresses these design issues by proposing Computation and Communication Aware task graph Scheduling (CCAS) for multi-GPGPU systems. CCAS adopts a heuristic algorithm that considers both data reuse and load balance to improve the performance of multi-GPGPU systems. In multi-graph applications, a pre-scan method is applied to cluster disjoint task graphs onto each GPGPU based on the characteristics of the graphs. In summary, the proposed CCAS approach achieves an average performance improvement of 22.15% compared with a previous work. In multi-graph applications, the proposed pre-scan clustering method achieves good performance scaling when the system size is increased from 2 to 4 GPGPUs.
author2	Lai, Bo-Cheng
author_facet	Lai, Bo-Cheng Wang, Yun-Ting 王允廷
author	Wang, Yun-Ting 王允廷
author_sort	Wang, Yun-Ting
title	Computation and Communication Aware Task Graph Scheduling on Multi-GPGPU Systems
publishDate	2014
url	http://ndltd.ncl.edu.tw/handle/71737217737762883446
_version_	1718088751186444288
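
The trade-off named in the description, balancing load across GPGPUs while keeping dependent tasks close to their data, can be illustrated with a small sketch. This is not the CCAS algorithm from the thesis, whose details the record does not give. It is a generic earliest-finish-time greedy list scheduler, and the function name `schedule`, the cost model (fixed per-task compute time, fixed per-edge transfer time, zero transfer cost when producer and consumer share a GPU), and the example graph are all assumptions made for illustration.

```python
from collections import deque


def schedule(tasks, edges, num_gpus):
    """Greedy earliest-finish-time placement of a task graph (illustrative only).

    tasks: {task_name: compute_time}
    edges: {(producer, consumer): transfer_time}, paid only across GPUs
    Returns (placement, finish_times, makespan).
    """
    preds = {t: [] for t in tasks}
    succs = {t: [] for t in tasks}
    for src, dst in edges:
        preds[dst].append(src)
        succs[src].append(dst)

    indegree = {t: len(preds[t]) for t in tasks}
    ready = deque(sorted(t for t in tasks if indegree[t] == 0))
    gpu_free = [0.0] * num_gpus  # time at which each GPU becomes idle
    placement, finish = {}, {}

    while ready:
        task = ready.popleft()
        best = None
        for gpu in range(num_gpus):
            # Input data is available once every producer has finished and,
            # if it ran on another GPU, its output has been transferred.
            data_ready = 0.0
            for p in preds[task]:
                cost = 0.0 if placement[p] == gpu else edges[(p, task)]
                data_ready = max(data_ready, finish[p] + cost)
            end = max(gpu_free[gpu], data_ready) + tasks[task]
            if best is None or end < best[0]:
                best = (end, gpu)
        end, gpu = best
        placement[task] = gpu
        finish[task] = end
        gpu_free[gpu] = end
        # Release successors whose producers have all been scheduled.
        for s in sorted(succs[task]):
            indegree[s] -= 1
            if indegree[s] == 0:
                ready.append(s)

    return placement, finish, max(finish.values())


if __name__ == "__main__":
    # Hypothetical diamond-shaped task graph: A feeds B and C, which feed D.
    tasks = {"A": 2, "B": 3, "C": 3, "D": 2}
    edges = {("A", "B"): 1, ("A", "C"): 1, ("B", "D"): 1, ("C", "D"): 1}
    print(schedule(tasks, edges, num_gpus=2))
```

In this toy graph the scheduler keeps B on the same GPU as A to avoid a transfer, then moves C to the idle GPU because the transfer costs less than waiting. The sketch does not model overlapping transfers with computation, which the description lists as the third challenge.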


Similar Items