Adaptive Heterogeneous Computing using Runtime Dispatching and Translation Techniques
Master's === National Chiao Tung University === Institute of Computer Science and Engineering === 101 === This thesis equips heterogeneous computing in the OpenCL framework with adaptability. The implementation first obtains the status of each device at runtime and then determines on which device each kernel executes. If a kernel is dispatched to the CPU, in ord...
Main Authors: | Tsai, Yi-Pu, 蔡怡璞 |
---|---|
Other Authors: | Hsu, Wei-Chung |
Format: | Others |
Language: | en_US |
Published: | 2013 |
Online Access: | http://ndltd.ncl.edu.tw/handle/36023886441426475476 |
id |
ndltd-TW-101NCTU5394153 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-101NCTU5394153 2016-07-02T04:20:29Z http://ndltd.ncl.edu.tw/handle/36023886441426475476 Adaptive Heterogeneous Computing using Runtime Dispatching and Translation Techniques 使用執行期配置與轉譯技術達成可適性異質多核心計算 Tsai, Yi-Pu 蔡怡璞 Master's National Chiao Tung University Institute of Computer Science and Engineering 101 This thesis equips heterogeneous computing in the OpenCL framework with adaptability. The implementation first obtains the status of each device at runtime and then determines on which device each kernel executes. If a kernel is dispatched to the CPU, it is translated into SIMD instructions using the translation techniques provided by Whole Function Vectorization, in order to improve performance on the CPU. Moreover, to make dispatching more intelligent, profile history is used to predict device loads. With these improvements, experiments indicate an average execution-time improvement of 65% on the samples provided by the AMD APP SDK. Hsu, Wei-Chung 徐慰中 2013 thesis 28 en_US |
collection |
NDLTD |
language |
en_US |
format |
Others |
sources |
NDLTD |
description |
Master's === National Chiao Tung University === Institute of Computer Science and Engineering === 101 === This thesis equips heterogeneous computing in the OpenCL framework with adaptability. The implementation first obtains the status of each device at runtime and then determines on which device each kernel executes. If a kernel is dispatched to the CPU, it is translated into SIMD instructions using the translation techniques provided by Whole Function Vectorization, in order to improve performance on the CPU. Moreover, to make dispatching more intelligent, profile history is used to predict device loads. With these improvements, experiments indicate an average execution-time improvement of 65% on the samples provided by the AMD APP SDK. |
author2 |
Hsu, Wei-Chung |
author_facet |
Hsu, Wei-Chung Tsai, Yi-Pu 蔡怡璞 |
author |
Tsai, Yi-Pu 蔡怡璞 |
spellingShingle |
Tsai, Yi-Pu 蔡怡璞 Adaptive Heterogeneous Computing using Runtime Dispatching and Translation Techniques |
author_sort |
Tsai, Yi-Pu |
title |
Adaptive Heterogeneous Computing using Runtime Dispatching and Translation Techniques |
publishDate |
2013 |
url |
http://ndltd.ncl.edu.tw/handle/36023886441426475476 |
work_keys_str_mv |
AT tsaiyipu adaptiveheterogeneouscomputingusingruntimedispatchingandtranslationtechniques AT càiyípú adaptiveheterogeneouscomputingusingruntimedispatchingandtranslationtechniques AT tsaiyipu shǐyòngzhíxíngqīpèizhìyǔzhuǎnyìjìshùdáchéngkěshìxìngyìzhìduōhéxīnjìsuàn AT càiyípú shǐyòngzhíxíngqīpèizhìyǔzhuǎnyìjìshùdáchéngkěshìxìngyìzhìduōhéxīnjìsuàn |
_version_ |
1718331500284346368 |
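
The dispatching idea summarized in the description (use profile history to predict device load, then choose a device for each kernel at runtime) can be illustrated with a minimal sketch. The record does not state the thesis's actual policy, so everything here is an assumption made for illustration: the class name `HistoryDispatcher`, the moving-average predictor, the window size, and the "smallest estimated finish time" rule are all hypothetical and do not use any OpenCL API.

```python
from collections import defaultdict, deque


class HistoryDispatcher:
    """Hypothetical dispatcher: picks a device per kernel from profiled run times."""

    def __init__(self, devices, window=4):
        self.devices = list(devices)
        # history[(kernel, device)] keeps the most recent `window` run times (seconds).
        self.history = defaultdict(lambda: deque(maxlen=window))
        # pending[device] is the predicted time of work already queued on that device.
        self.pending = {d: 0.0 for d in self.devices}

    def record(self, kernel, device, seconds):
        """Store one profiled run time for a (kernel, device) pair."""
        self.history[(kernel, device)].append(seconds)

    def predict(self, kernel, device):
        """Predict run time as the mean of recent samples (assumed predictor)."""
        runs = self.history[(kernel, device)]
        # With no profile yet, predict 0.0 so an unprofiled device gets tried.
        return sum(runs) / len(runs) if runs else 0.0

    def choose(self, kernel):
        """Return the device with the smallest estimated finish time."""
        return min(
            self.devices,
            key=lambda d: self.pending[d] + self.predict(kernel, d),
        )

    def submit(self, kernel):
        """Choose a device and add the predicted kernel time to its queued load."""
        device = self.choose(kernel)
        self.pending[device] += self.predict(kernel, device)
        return device

    def complete(self, kernel, device, seconds):
        """Remove the queued estimate and store the measured run time."""
        estimate = self.predict(kernel, device)
        self.pending[device] = max(0.0, self.pending[device] - estimate)
        self.record(kernel, device, seconds)
```

With profiled times of 0.5 s on `"cpu"` and 0.375 s on `"gpu"` for the same kernel, the first `submit` selects the GPU; a second `submit` issued before the first completes selects the CPU, because the GPU's queued load now makes its estimated finish time later. In a real OpenCL runtime the measured times would come from event profiling, and the CPU path would run the vectorized kernel; both are outside the scope of this sketch.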