Accelerating Boolean Matching for Large Functions on CUDA Platform
Master's === Fu Jen Catholic University === Master's Program, Department of Computer Science and Information Engineering === 102 === Given two functions with the same number of input variables, Boolean matching checks whether they are equivalent under input permutation and input/output phase assignments. Our Boolean matching algorithm adopts an incremental learning approach which has t...
Main Authors: | Feng-Ming Chang 張峰銘 |
---|---|
Other Authors: | Kuo-Hua Wang 王國華 |
Format: | Others |
Language: | zh-TW |
Published: |
2014
|
Online Access: | http://ndltd.ncl.edu.tw/handle/65491311542595430699 |
id |
ndltd-TW-102FJU00396045 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-102FJU003960452016-09-11T04:08:40Z http://ndltd.ncl.edu.tw/handle/65491311542595430699 Accelerating Boolean Matching for Large Functions on CUDA Platform 在CUDA平台上加速大型函數之布林比對 Feng-Ming Chang 張峰銘 Master's, Fu Jen Catholic University, Master's Program, Department of Computer Science and Information Engineering, academic year 102. Given two functions with the same number of input variables, Boolean matching checks whether they are equivalent under input permutation and input/output phase assignments. Our Boolean matching algorithm adopts an incremental learning approach with two phases: the learning phase and the applying phase. The learning phase learns and stores the knowledge acquired from the partial mappings computed so far; the applying phase exploits the learnt knowledge to avoid redundant computations and thus speed up the search. Moreover, our algorithm uses signatures to shrink the solution space. Even with these techniques, matching large functions remains time-consuming. Recently, the architectural innovation of GPGPUs (General-Purpose Graphics Processing Units) has attracted attention from many researchers and has been applied to many EDA (Electronic Design Automation) problems. In this thesis, we first analyze the incremental learning approach and then propose a weight computing method to accelerate the matching process. Our analysis shows that most of the running time is spent in two parts: the AND procedure and the Reduction procedure. We therefore parallelize these two procedures on the CUDA (Compute Unified Device Architecture) platform. Directly parallelizing them raises three issues: (1) the original data structures degrade the performance of AND operations; (2) GPU memory is insufficient when matching large functions; (3) without memory coalescing, GPU memory reads are very inefficient.
To solve these problems, we propose three techniques to optimize the naive parallel procedure: redesigning the data structures to improve the AND procedure's performance, using the CUDA stream technique to reduce memory usage, and transforming the data layout to achieve memory coalescing. Experimental results show that the weight computing method achieves a 2.6X performance improvement over the original one. We also conducted experiments to verify whether our parallel matching algorithm expedites matching for large Boolean functions of various input sizes. Compared with the sequential matching method, the naive parallel method and our optimized parallel method achieve 5X and 30X speedups, respectively. These results show that our optimized parallel Boolean matching algorithm is indeed effective and efficient for large Boolean functions. Kuo-Hua Wang 王國華 2014 學位論文 ; thesis 73 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others
|
sources |
NDLTD |
description |
Master's === Fu Jen Catholic University === Master's Program, Department of Computer Science and Information Engineering === 102 === Given two functions with the same number of input variables, Boolean matching checks whether they are equivalent under input permutation and input/output phase assignments. Our Boolean matching algorithm adopts an incremental learning approach with two phases: the learning phase and the applying phase. The learning phase learns and stores the knowledge acquired from the partial mappings computed so far; the applying phase exploits the learnt knowledge to avoid redundant computations and thus speed up the search. Moreover, our algorithm uses signatures to shrink the solution space. Even with these techniques, matching large functions remains time-consuming. Recently, the architectural innovation of GPGPUs (General-Purpose Graphics Processing Units) has attracted attention from many researchers and has been applied to many EDA (Electronic Design Automation) problems.
In this thesis, we first analyze the incremental learning approach and then propose a weight computing method to accelerate the matching process. Our analysis shows that most of the running time is spent in two parts: the AND procedure and the Reduction procedure. We therefore parallelize these two procedures on the CUDA (Compute Unified Device Architecture) platform. Directly parallelizing them raises three issues: (1) the original data structures degrade the performance of AND operations; (2) GPU memory is insufficient when matching large functions; (3) without memory coalescing, GPU memory reads are very inefficient. To solve these problems, we propose three techniques to optimize the naive parallel procedure: redesigning the data structures to improve the AND procedure's performance, using the CUDA stream technique to reduce memory usage, and transforming the data layout to achieve memory coalescing.
Experimental results show that the weight computing method achieves a 2.6X performance improvement over the original one. We also conducted experiments to verify whether our parallel matching algorithm expedites matching for large Boolean functions of various input sizes. Compared with the sequential matching method, the naive parallel method and our optimized parallel method achieve 5X and 30X speedups, respectively. These results show that our optimized parallel Boolean matching algorithm is indeed effective and efficient for large Boolean functions.
|
author2 |
Kuo-Hua Wang |
author_facet |
Kuo-Hua Wang Feng-Ming Chang 張峰銘 |
author |
Feng-Ming Chang 張峰銘 |
spellingShingle |
Feng-Ming Chang 張峰銘 Accelerating Boolean Matching for Large Functions on CUDA Platform |
author_sort |
Feng-Ming Chang |
title |
Accelerating Boolean Matching for Large Functions on CUDA Platform |
title_short |
Accelerating Boolean Matching for Large Functions on CUDA Platform |
title_full |
Accelerating Boolean Matching for Large Functions on CUDA Platform |
title_fullStr |
Accelerating Boolean Matching for Large Functions on CUDA Platform |
title_full_unstemmed |
Accelerating Boolean Matching for Large Functions on CUDA Platform |
title_sort |
accelerating boolean matching for large functions on cuda platform |
publishDate |
2014 |
url |
http://ndltd.ncl.edu.tw/handle/65491311542595430699 |
work_keys_str_mv |
AT fengmingchang acceleratingbooleanmatchingforlargefunctionsoncudaplatform AT zhāngfēngmíng acceleratingbooleanmatchingforlargefunctionsoncudaplatform AT fengmingchang zàicudapíngtáishàngjiāsùdàxínghánshùzhībùlínbǐduì AT zhāngfēngmíng zàicudapíngtáishàngjiāsùdàxínghánshùzhībùlínbǐduì |
_version_ |
1718383120891248640 |