Accelerating Boolean Matching for Large Functions on CUDA Platform
Master's === Fu Jen Catholic University === Master's Program, Department of Computer Science and Information Engineering === 102 === Given two functions with the same number of input variables, Boolean matching checks whether they are equivalent under input permutation and input/output phase assignments. Our Boolean matching algorithm adopts an incremental learning approach which has t...
Main Authors: | Feng-Ming Chang 張峰銘 |
---|---|
Other Authors: | Kuo-Hua Wang 王國華 |
Format: | Others |
Language: | zh-TW |
Published: |
2014
|
Online Access: | http://ndltd.ncl.edu.tw/handle/65491311542595430699 |
id |
ndltd-TW-102FJU00396045 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-TW-102FJU003960452016-09-11T04:08:40Z http://ndltd.ncl.edu.tw/handle/65491311542595430699 Accelerating Boolean Matching for Large Functions on CUDA Platform 在CUDA平台上加速大型函數之布林比對 Feng-Ming Chang 張峰銘 Master's, Fu Jen Catholic University, Master's Program, Department of Computer Science and Information Engineering, academic year 102. Given two functions with the same number of input variables, Boolean matching checks whether they are equivalent under input permutation and input/output phase assignments. Our Boolean matching algorithm adopts an incremental learning approach with two phases: the learning phase and the applying phase. The learning phase learns and stores the knowledge acquired from the partial mappings computed so far; the applying phase exploits the learnt knowledge to avoid redundant computations and thus speed up the search. Moreover, our algorithm uses signatures to shrink the solution space. Even with these techniques, matching large functions remains time-consuming. Recently, the architectural innovation of GPGPUs (General-Purpose Graphics Processing Units) has attracted attention from many researchers and has been applied to many EDA (Electronic Design Automation) problems. In this thesis, we first analyze the incremental learning approach and then propose a weight computing method to accelerate the matching process. Our analysis shows that most of the running time is spent in two parts: the AND procedure and the Reduction procedure. We therefore parallelize these two procedures on the CUDA (Compute Unified Device Architecture) platform. Directly parallelizing them raises three issues: (1) the original data structures degrade the performance of AND operations; (2) GPU memory is insufficient when matching large functions; (3) without memory coalescing, GPU memory reads are very inefficient.
To solve these problems, we propose three techniques to optimize the naive parallel procedure: redesigning the data structures to improve the AND procedure's performance, using the CUDA stream technique to reduce memory usage, and transforming the data layout to achieve memory coalescing. Experimental results show that the weight computing method achieves a 2.6X performance improvement over the original one. We also conducted experiments to verify whether our parallel matching algorithm expedites matching for large Boolean functions of various input sizes. Compared with the sequential matching method, the naive parallel method and our optimized parallel method achieve 5X and 30X speedups, respectively. These results show that our optimized parallel Boolean matching algorithm is indeed effective and efficient for large Boolean functions. Kuo-Hua Wang 王國華 2014 學位論文 ; thesis 73 zh-TW |
collection |
NDLTD |
language |
zh-TW |
format |
Others
|
sources |
NDLTD |
description |
Master's === Fu Jen Catholic University === Master's Program, Department of Computer Science and Information Engineering === 102 === Given two functions with the same number of input variables, Boolean matching checks whether they are equivalent under input permutation and input/output phase assignments. Our Boolean matching algorithm adopts an incremental learning approach with two phases: the learning phase and the applying phase. The learning phase learns and stores the knowledge acquired from the partial mappings computed so far; the applying phase exploits the learnt knowledge to avoid redundant computations and thus speed up the search. Moreover, our algorithm uses signatures to shrink the solution space. Even with these techniques, matching large functions remains time-consuming. Recently, the architectural innovation of GPGPUs (General-Purpose Graphics Processing Units) has attracted attention from many researchers and has been applied to many EDA (Electronic Design Automation) problems.
In this thesis, we first analyze the incremental learning approach and then propose a weight computing method to accelerate the matching process. Our analysis shows that most of the running time is spent in two parts: the AND procedure and the Reduction procedure. We therefore parallelize these two procedures on the CUDA (Compute Unified Device Architecture) platform. Directly parallelizing them raises three issues: (1) the original data structures degrade the performance of AND operations; (2) GPU memory is insufficient when matching large functions; (3) without memory coalescing, GPU memory reads are very inefficient. To solve these problems, we propose three techniques to optimize the naive parallel procedure: redesigning the data structures to improve the AND procedure's performance, using the CUDA stream technique to reduce memory usage, and transforming the data layout to achieve memory coalescing.
Experimental results show that the weight computing method achieves a 2.6X performance improvement over the original one. We also conducted experiments to verify whether our parallel matching algorithm expedites matching for large Boolean functions of various input sizes. Compared with the sequential matching method, the naive parallel method and our optimized parallel method achieve 5X and 30X speedups, respectively. These results show that our optimized parallel Boolean matching algorithm is indeed effective and efficient for large Boolean functions.
|
author2 |
Kuo-Hua Wang |
author_facet |
Kuo-Hua Wang Feng-Ming Chang 張峰銘 |
author |
Feng-Ming Chang 張峰銘 |
spellingShingle |
Feng-Ming Chang 張峰銘 Accelerating Boolean Matching for Large Functions on CUDA Platform |
author_sort |
Feng-Ming Chang |
title |
Accelerating Boolean Matching for Large Functions on CUDA Platform |
title_short |
Accelerating Boolean Matching for Large Functions on CUDA Platform |
title_full |
Accelerating Boolean Matching for Large Functions on CUDA Platform |
title_fullStr |
Accelerating Boolean Matching for Large Functions on CUDA Platform |
title_full_unstemmed |
Accelerating Boolean Matching for Large Functions on CUDA Platform |
title_sort |
accelerating boolean matching for large functions on cuda platform |
publishDate |
2014 |
url |
http://ndltd.ncl.edu.tw/handle/65491311542595430699 |
work_keys_str_mv |
AT fengmingchang acceleratingbooleanmatchingforlargefunctionsoncudaplatform AT zhāngfēngmíng acceleratingbooleanmatchingforlargefunctionsoncudaplatform AT fengmingchang zàicudapíngtáishàngjiāsùdàxínghánshùzhībùlínbǐduì AT zhāngfēngmíng zàicudapíngtáishàngjiāsùdàxínghánshùzhībùlínbǐduì |
_version_ |
1718383120891248640 |