A Binary Translator to Accelerate Development of Deep Learning Processing Library for AArch64 CPU

To accelerate deep learning (DL) processes on the supercomputer Fugaku, the authors have ported and optimized oneDNN for Fugaku's CPU, the Fujitsu A64FX. oneDNN is an open-source DL processing library developed by Intel for the x86_64 architecture. The A64FX CPU is based on the Armv8-A architec...

Bibliographic Details
Main Authors:	Fukumoto, N. (Author), Honda, T. (Author), Kawakami, K. (Author), Kurihara, K. (Author), Yamazaki, M. (Author)
Format:	Article
Language:	English
Published:	Institute of Electronics, Information and Communication Engineers, 2022
Subjects:	AArch64; binary translator; deep learning; just-in-time assembler; oneDNN
Online Access:	View Fulltext in Publisher


LEADER	02073nam a2200241Ia 4500
001	10.1587-TRANSELE.2021LHP0001
008	220630s2022 CNT 000 0 und d
020			\|a 09168524 (ISSN)
245	1	0	\|a A Binary Translator to Accelerate Development of Deep Learning Processing Library for AArch64 CPU
260		0	\|b Institute of Electronics Information Communication Engineers \|c 2022
520	3		\|a To accelerate deep learning (DL) processes on the supercomputer Fugaku, the authors have ported and optimized oneDNN for Fugaku's CPU, the Fujitsu A64FX. oneDNN is an open-source DL processing library developed by Intel for the x86_64 architecture. The A64FX CPU is based on the Armv8-A architecture. oneDNN dynamically creates the execution code for the computation kernels, which are implemented at the granularity of x86_64 instructions using Xbyak, the Just-In-Time (JIT) assembler for x86_64 architecture. To port oneDNN to A64FX, it must be rewritten into Armv8-A instructions using Xbyak_aarch64, the JIT assembler for the Armv8-A architecture. This is challenging because the number of steps to be rewritten exceeds several tens of thousands of lines. This study presents the Xbyak_translator_aarch64. Xbyak_translator_aarch64 is a binary translator that at runtime converts dynamically produced executable codes for the x86_64 architecture into executable codes for the Armv8-A architecture. Xbyak_translator_aarch64 eliminates the need to rewrite the source code for porting oneDNN to A64FX and allows us to port oneDNN to A64FX quickly. Copyright © 2022 The Institute of Electronics, Information and Communication Engineers.
650	0	4	\|a AArch64
650	0	4	\|a binary translator
650	0	4	\|a deep learning
650	0	4	\|a just-in-time assembler
650	0	4	\|a oneDNN
700	1	0	\|a Fukumoto, N. \|e author
700	1	0	\|a Honda, T. \|e author
700	1	0	\|a Kawakami, K. \|e author
700	1	0	\|a Kurihara, K. \|e author
700	1	0	\|a Yamazaki, M. \|e author
773			\|t IEICE Transactions on Electronics
856			\|z View Fulltext in Publisher \|u https://doi.org/10.1587/TRANSELE.2021LHP0001

Similar Items