HI-FFT: Heterogeneous Parallel In-Place Algorithm for Large-Scale 2D-FFT

Fast Fourier Transform (FFT) is a fundamental operation for 2D data in various applications. To accelerate large-scale 2D-FFT computation, we propose a Heterogeneous parallel In-place 2D-FFT algorithm, HI-FFT. Our novel work decomposition method makes it possible to run our parallel algorithm on the...

Bibliographic Details
Main Authors: Homin Kang, Jaehong Lee, Duksu Kim
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Subjects:
CPU
GPU
Online Access: https://ieeexplore.ieee.org/document/9524622/
id doaj-1a7dd4daae194ff7a68e7600aae5e708
record_format Article
spelling doaj-1a7dd4daae194ff7a68e7600aae5e708
2021-09-03T23:00:31Z
eng
IEEE
IEEE Access
2169-3536
2021-01-01
9
120261
120273
10.1109/ACCESS.2021.3108404
9524622
HI-FFT: Heterogeneous Parallel In-Place Algorithm for Large-Scale 2D-FFT
Homin Kang
Jaehong Lee, https://orcid.org/0000-0002-8311-5975
Duksu Kim, https://orcid.org/0000-0002-9075-3983
School of Computer Engineering, Korea University of Technology and Education (KOREATECH), Cheonan, South Korea
School of Computer Engineering, Korea University of Technology and Education (KOREATECH), Cheonan, South Korea
School of Computer Engineering, Korea University of Technology and Education (KOREATECH), Cheonan, South Korea
https://ieeexplore.ieee.org/document/9524622/
2D-FFT
heterogeneous
parallel
CPU
GPU
in-place
collection DOAJ
language English
format Article
sources DOAJ
author Homin Kang
Jaehong Lee
Duksu Kim
spellingShingle Homin Kang
Jaehong Lee
Duksu Kim
HI-FFT: Heterogeneous Parallel In-Place Algorithm for Large-Scale 2D-FFT
IEEE Access
2D-FFT
heterogeneous
parallel
CPU
GPU
in-place
author_facet Homin Kang
Jaehong Lee
Duksu Kim
author_sort Homin Kang
title HI-FFT: Heterogeneous Parallel In-Place Algorithm for Large-Scale 2D-FFT
title_short HI-FFT: Heterogeneous Parallel In-Place Algorithm for Large-Scale 2D-FFT
title_full HI-FFT: Heterogeneous Parallel In-Place Algorithm for Large-Scale 2D-FFT
title_fullStr HI-FFT: Heterogeneous Parallel In-Place Algorithm for Large-Scale 2D-FFT
title_full_unstemmed HI-FFT: Heterogeneous Parallel In-Place Algorithm for Large-Scale 2D-FFT
title_sort hi-fft: heterogeneous parallel in-place algorithm for large-scale 2d-fft
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2021-01-01
description The Fast Fourier Transform (FFT) is a fundamental operation for 2D data in various applications. To accelerate large-scale 2D-FFT computation, we propose a Heterogeneous parallel In-place 2D-FFT algorithm, HI-FFT. Our novel work decomposition method makes it possible to run our parallel algorithm on the original data (i.e., in-place), unlike prior parallel algorithms that require additional memory space (i.e., out-of-place) to guarantee independence among sub-tasks. Our work decomposition method also removes the duplicated operations found in the out-of-place approaches. Using our decomposition method, we introduce an in-place heterogeneous parallel algorithm that utilizes both a multi-core CPU and a GPU simultaneously. To maximize the utilization efficiency of the computing resources, we also propose a priority-based dynamic scheduling method. We compared the performance of seven different 2D-FFT algorithms, including ours, on large-scale 2D-FFT problems whose sizes varied from 20K² to 120K². Our method achieved up to 2.92 and 4.42 times higher performance than the conventional homogeneous parallel algorithms based on the state-of-the-art CPU and GPU libraries, respectively. Our method also showed up to 2.27 times higher performance than the prior heterogeneous algorithms while requiring half the memory space. To check the benefit of HI-FFT in an actual application, we applied it to a computer-generated holography (CGH) process and found that it reduces the hologram generation time. These results demonstrate the advantage of our approach for large-scale 2D-FFT computation.
topic 2D-FFT
heterogeneous
parallel
CPU
GPU
in-place
url https://ieeexplore.ieee.org/document/9524622/