Accelerating Boolean Matching for Large Functions on CUDA Platform

Bibliographic Details
Main Author: Feng-Ming Chang (張峰銘)
Other Authors: Kuo-Hua Wang
Format: Others
Language: zh-TW
Published: 2014
Online Access: http://ndltd.ncl.edu.tw/handle/65491311542595430699
id ndltd-TW-102FJU00396045
record_format oai_dc
spelling ndltd-TW-102FJU00396045 2016-09-11T04:08:40Z http://ndltd.ncl.edu.tw/handle/65491311542595430699 Accelerating Boolean Matching for Large Functions on CUDA Platform 在CUDA平台上加速大型函數之布林比對 Feng-Ming Chang 張峰銘 Master's thesis, Fu Jen Catholic University, Department of Computer Science and Information Engineering (Master's Program), academic year 102. Kuo-Hua Wang 王國華 2014 學位論文 ; thesis (73 pages) zh-TW
collection NDLTD
language zh-TW
format Others
sources NDLTD
description Master's === Fu Jen Catholic University === Department of Computer Science and Information Engineering, Master's Program === Academic year 102 === Given two functions with the same number of input variables, Boolean matching checks whether they are equivalent under input permutation and input/output phase assignment. Our Boolean matching algorithm adopts an incremental learning approach with two phases: a learning phase and an applying phase. The learning phase learns and stores knowledge acquired from the partial mappings computed so far; the applying phase exploits the learned knowledge to avoid redundant computation and thus speed up the search. Moreover, our algorithm uses signatures to shrink the solution space. Even with these techniques, matching large functions remains time-consuming. Recently, GPGPUs (General-Purpose Graphics Processing Units) have attracted growing attention from researchers and have been applied to many EDA (Electronic Design Automation) problems. In this thesis, we first analyze the incremental learning approach and then propose a weight-computing method to accelerate the matching process. Our analysis shows that most of the running time is spent in two parts, the AND procedure and the Reduction procedure, so we parallelize these two procedures on the CUDA (Compute Unified Device Architecture) platform. Directly parallelizing them raises three issues: (1) the original data-structure design degrades the performance of AND operations; (2) GPU memory is insufficient when matching large functions; (3) without memory coalescing, reads from GPU memory are very inefficient. To solve these problems, we propose three techniques to optimize the naive parallel procedure: redesigning the data structures to improve AND performance, using CUDA streams to reduce memory usage, and transforming the data layout to realize memory coalescing. Experimental results show that the weight-computing method achieves a 2.6X improvement over the original method. We also ran experiments to verify that our parallel algorithm expedites matching for large Boolean functions of various input sizes: compared with the sequential method, the naive parallel method and our optimized parallel method achieve 5X and 30X speedups, respectively. This shows that our optimized parallel Boolean matching algorithm is both effective and efficient for large Boolean functions.
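To make the problem statement concrete, the following is a minimal host-side sketch (plain C++, compilable with nvcc) of Boolean matching by brute force over all input permutations and input/output phase assignments, for tiny functions given as truth tables. The names matchUnder and booleanMatch and the exhaustive strategy are illustrative assumptions, not the thesis's algorithm; the thesis prunes this search space with signatures and incremental learning precisely because the exhaustive search costs O(n! * 2^(n+1)) candidate mappings.

```cuda
#include <cstdint>
#include <vector>
#include <algorithm>

// Check whether g(x) == f(perm(x XOR inPhase)) XOR outPhase for all x.
// Hypothetical helper for illustration; assumes small n (truth tables fit in memory).
static bool matchUnder(const std::vector<int>& f, const std::vector<int>& g,
                       const std::vector<int>& perm, uint32_t inPhase,
                       bool outPhase, int n) {
    for (uint32_t x = 0; x < (1u << n); ++x) {
        uint32_t y = 0;
        for (int i = 0; i < n; ++i) {
            // Input i of g is wired to input perm[i] of f, possibly inverted.
            uint32_t bit = (x >> i) & 1u;
            y |= (bit ^ ((inPhase >> i) & 1u)) << perm[i];
        }
        if ((f[y] ^ (outPhase ? 1 : 0)) != g[x]) return false;
    }
    return true;
}

// Brute-force Boolean matching: try every input permutation and every
// input/output phase assignment.
bool booleanMatch(const std::vector<int>& f, const std::vector<int>& g, int n) {
    std::vector<int> perm(n);
    for (int i = 0; i < n; ++i) perm[i] = i;
    do {                                              // all n! permutations
        for (uint32_t ph = 0; ph < (1u << n); ++ph)   // all 2^n input phases
            if (matchUnder(f, g, perm, ph, false, n) ||
                matchUnder(f, g, perm, ph, true, n))  // both output phases
                return true;
    } while (std::next_permutation(perm.begin(), perm.end()));
    return false;
}
```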
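The AND procedure over bit-packed vectors is a natural fit for a data-parallel kernel, and it also illustrates the memory-coalescing point. The sketch below assumes a flat word-array layout (not necessarily the thesis's redesigned data structure): because consecutive threads read consecutive 32-bit words, each warp issues fully coalesced global-memory loads.

```cuda
#include <cuda_runtime.h>
#include <cstdint>
#include <cstddef>

// Bitwise AND of two bit-packed vectors, one 32-bit word per thread per step.
__global__ void andKernel(const uint32_t* __restrict__ a,
                          const uint32_t* __restrict__ b,
                          uint32_t* __restrict__ out, size_t nWords) {
    size_t i = blockIdx.x * (size_t)blockDim.x + threadIdx.x;
    size_t stride = (size_t)gridDim.x * blockDim.x;
    // Grid-stride loop: thread k touches words k, k+stride, ..., so each
    // warp reads 32 adjacent words per iteration (coalesced access).
    for (; i < nWords; i += stride)
        out[i] = a[i] & b[i];
}
```

The grid-stride loop keeps the launch configuration independent of the vector length, which matters here because the matching instances vary widely in size.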
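For the GPU-memory issue, the CUDA-stream idea can be sketched as a chunked pipeline: only a few fixed-size chunks are resident on the device at any time, and host-device copies overlap with compute. The chunk size, stream count, and helper name andInChunks are illustrative assumptions, and the sketch reuses andKernel from the previous example.

```cuda
#include <cuda_runtime.h>
#include <cstdint>
#include <cstddef>
#include <algorithm>

// Defined in the previous sketch.
__global__ void andKernel(const uint32_t* a, const uint32_t* b,
                          uint32_t* out, size_t nWords);

// AND two large host-side vectors in chunks so device memory stays bounded.
// For true copy/compute overlap, hA/hB/hOut should be pinned (cudaMallocHost).
void andInChunks(const uint32_t* hA, const uint32_t* hB, uint32_t* hOut,
                 size_t nWords) {
    const int nStreams = 4;                  // illustrative choice
    const size_t chunk = 1 << 20;            // 1M words per chunk (example)
    cudaStream_t s[nStreams];
    uint32_t *dA[nStreams], *dB[nStreams], *dO[nStreams];
    for (int i = 0; i < nStreams; ++i) {
        cudaStreamCreate(&s[i]);
        cudaMalloc(&dA[i], chunk * sizeof(uint32_t));
        cudaMalloc(&dB[i], chunk * sizeof(uint32_t));
        cudaMalloc(&dO[i], chunk * sizeof(uint32_t));
    }
    for (size_t off = 0, c = 0; off < nWords; off += chunk, ++c) {
        int i = (int)(c % nStreams);          // round-robin over streams;
        size_t len = std::min(chunk, nWords - off);
        // Work issued to one stream runs in order, so buffer reuse is safe.
        cudaMemcpyAsync(dA[i], hA + off, len * sizeof(uint32_t),
                        cudaMemcpyHostToDevice, s[i]);
        cudaMemcpyAsync(dB[i], hB + off, len * sizeof(uint32_t),
                        cudaMemcpyHostToDevice, s[i]);
        andKernel<<<256, 256, 0, s[i]>>>(dA[i], dB[i], dO[i], len);
        cudaMemcpyAsync(hOut + off, dO[i], len * sizeof(uint32_t),
                        cudaMemcpyDeviceToHost, s[i]);
    }
    for (int i = 0; i < nStreams; ++i) {
        cudaStreamSynchronize(s[i]);
        cudaFree(dA[i]); cudaFree(dB[i]); cudaFree(dO[i]);
        cudaStreamDestroy(s[i]);
    }
}
```

With this structure, device residency is bounded by nStreams * 3 * chunk words regardless of the function size, which matches the thesis's motivation for using streams to reduce GPU memory usage.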
author2 Kuo-Hua Wang
author_facet Kuo-Hua Wang
Feng-Ming Chang
張峰銘
author Feng-Ming Chang
張峰銘
spellingShingle Feng-Ming Chang
張峰銘
Accelerating Boolean Matching for Large Functions on CUDA Platform
author_sort Feng-Ming Chang
title Accelerating Boolean Matching for Large Functions on CUDA Platform
title_short Accelerating Boolean Matching for Large Functions on CUDA Platform
title_full Accelerating Boolean Matching for Large Functions on CUDA Platform
title_fullStr Accelerating Boolean Matching for Large Functions on CUDA Platform
title_full_unstemmed Accelerating Boolean Matching for Large Functions on CUDA Platform
title_sort accelerating boolean matching for large functions on cuda platform
publishDate 2014
url http://ndltd.ncl.edu.tw/handle/65491311542595430699
work_keys_str_mv AT fengmingchang acceleratingbooleanmatchingforlargefunctionsoncudaplatform
AT zhāngfēngmíng acceleratingbooleanmatchingforlargefunctionsoncudaplatform
AT fengmingchang zàicudapíngtáishàngjiāsùdàxínghánshùzhībùlínbǐduì
AT zhāngfēngmíng zàicudapíngtáishàngjiāsùdàxínghánshùzhībùlínbǐduì
_version_ 1718383120891248640