A Binary Translator to Accelerate Development of Deep Learning Processing Library for AArch64 CPU

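To accelerate deep learning (DL) processes on the supercomputer Fugaku, the authors have ported and optimized oneDNN for Fugaku's CPU, the Fujitsu A64FX. oneDNN is an open-source DL processing library developed by Intel for the x86_64 architecture. The A64FX CPU is based on the Armv8-A architecture. oneDNN dynamically creates the execution code for its computation kernels, which are implemented at the granularity of x86_64 instructions using Xbyak, the Just-In-Time (JIT) assembler for the x86_64 architecture. To port oneDNN to A64FX, these kernels must be rewritten in Armv8-A instructions using Xbyak_aarch64, the JIT assembler for the Armv8-A architecture. This is challenging because the source code to be rewritten runs to several tens of thousands of lines. This study presents Xbyak_translator_aarch64, a binary translator that converts, at runtime, dynamically generated executable code for the x86_64 architecture into executable code for the Armv8-A architecture. Xbyak_translator_aarch64 eliminates the need to rewrite the source code and allows oneDNN to be ported to A64FX quickly.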

Bibliographic Details
Main Authors: Fukumoto, N. (Author), Honda, T. (Author), Kawakami, K. (Author), Kurihara, K. (Author), Yamazaki, M. (Author)
Format: Article
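Language: English
Published: Institute of Electronics, Information and Communication Engineers, 2022
Subjects: AArch64, binary translator, deep learning, just-in-time assembler, oneDNN
Online Access: View Fulltext in Publisher (https://doi.org/10.1587/TRANSELE.2021LHP0001)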
LEADER 02073nam a2200241Ia 4500
001 10.1587-TRANSELE.2021LHP0001
008 220630s2022 CNT 000 0 und d
020 |a 09168524 (ISSN) 
245 1 0 |a A Binary Translator to Accelerate Development of Deep Learning Processing Library for AArch64 CPU 
260 0 |b Institute of Electronics Information Communication Engineers  |c 2022 
520 3 |a To accelerate deep learning (DL) processes on the supercomputer Fugaku, the authors have ported and optimized oneDNN for Fugaku's CPU, the Fujitsu A64FX. oneDNN is an open-source DL processing library developed by Intel for the x86_64 architecture. The A64FX CPU is based on the Armv8-A architecture. oneDNN dynamically creates the execution code for the computation kernels, which are implemented at the granularity of x86_64 instructions using Xbyak, the Just-In-Time (JIT) assembler for x86_64 architecture. To port oneDNN to A64FX, it must be rewritten into Armv8-A instructions using Xbyak_aarch64, the JIT assembler for the Armv8-A architecture. This is challenging because the number of steps to be rewritten exceeds several tens of thousands of lines. This study presents the Xbyak_translator_aarch64. Xbyak_translator_aarch64 is a binary translator that at runtime converts dynamically produced executable codes for the x86_64 architecture into executable codes for the Armv8-A architecture. Xbyak_translator_aarch64 eliminates the need to rewrite the source code for porting oneDNN to A64FX and allows us to port oneDNN to A64FX quickly. Copyright © 2022 The Institute of Electronics, Information and Communication Engineers. 
650 0 4 |a AArch64 
650 0 4 |a binary translator 
650 0 4 |a deep learning 
650 0 4 |a just-in-time assembler 
650 0 4 |a oneDNN 
700 1 0 |a Fukumoto, N.  |e author 
700 1 0 |a Honda, T.  |e author 
700 1 0 |a Kawakami, K.  |e author 
700 1 0 |a Kurihara, K.  |e author 
700 1 0 |a Yamazaki, M.  |e author 
773 |t IEICE Transactions on Electronics 
856 |z View Fulltext in Publisher  |u https://doi.org/10.1587/TRANSELE.2021LHP0001