A Cloud Server Oriented FPGA Accelerator for LSTM Recurrent Neural Network

Long Short-Term Memory (LSTM) is the most widely used recurrent neural network architecture. It plays an important role in a number of research areas, such as language modeling, machine translation, and image captioning. However, owing to its recurrent nature, general-purpose processors such as CPUs and GPGPUs achieve only limited parallelism on LSTM while consuming considerable power. FPGA accelerators can outperform general-purpose processors through flexibility, energy efficiency, and finer-grained optimization of recurrence-based algorithms. In this paper, we present the design and implementation of a cloud-oriented FPGA accelerator for LSTM. Unlike most previous work, which targets embedded systems, our accelerator transfers data sequences to and from the host server through PCIe and performs multiple time-series predictions in parallel. We optimize both the on-chip computation and the communication between the host server and the FPGA board. We evaluate the overall performance as well as the computation and PCIe communication costs. The results show that our implementation outperforms CPU-based and other hardware-based implementations.
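The abstract's claim that general-purpose processors achieve limited parallelism follows from the LSTM recurrence itself: each time step consumes the hidden and cell states produced by the previous one. Below is a minimal NumPy sketch of a standard LSTM cell step, not the paper's implementation, illustrating that sequential dependency; the parallelism the accelerator exploits comes from batching multiple independent sequences, not from parallelizing within one.

```python
import numpy as np

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. The gates depend on h_prev and c_prev,
    so steps along a single sequence cannot run in parallel."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b            # joint pre-activation, length 4H
    i = 1.0 / (1.0 + np.exp(-z[:H]))      # input gate
    f = 1.0 / (1.0 + np.exp(-z[H:2*H]))   # forget gate
    o = 1.0 / (1.0 + np.exp(-z[2*H:3*H])) # output gate
    g = np.tanh(z[3*H:])                  # candidate cell state
    c = f * c_prev + i * g                # new cell state
    h = o * np.tanh(c)                    # new hidden state
    return h, c

# Toy dimensions (illustrative, not from the paper).
rng = np.random.default_rng(0)
X, H = 8, 4                               # input size, hidden size
W = rng.standard_normal((4 * H, X)) * 0.1
U = rng.standard_normal((4 * H, H)) * 0.1
b = np.zeros(4 * H)

h = np.zeros(H)
c = np.zeros(H)
for t in range(10):                       # one 10-step sequence, strictly serial
    h, c = lstm_step(rng.standard_normal(X), h, c, W, U, b)
```

Because the loop carries `h` and `c` from step to step, the only way to keep many compute units busy is to process many sequences at once, which is why the accelerator batches multiple time-series predictions per PCIe transfer.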

Bibliographic Details
Main Authors: Jun Liu, Jiasheng Wang, Yu Zhou, Fang Liu
Format: Article
Language: English
Published: IEEE, 2019-01-01
Series: IEEE Access
Subjects: Long short-term memory; FPGA; cloud computing; recurrent neural network; OpenCL
Online Access: https://ieeexplore.ieee.org/document/8819996/
Record ID: doaj-3a97a08e64cb4b16b373ac8b6ec9a39e
Collection: DOAJ
DOI: 10.1109/ACCESS.2019.2938234
ISSN: 2169-3536
Citation: IEEE Access, vol. 7, pp. 122408-122418, 2019
IEEE Document: 8819996
ORCID (Jun Liu): https://orcid.org/0000-0003-4007-6109
Affiliation: Center for Data Science, Beijing University of Posts and Telecommunications, Beijing, China (all four authors)