Learning Sparse Low-Precision Neural Networks With Learnable Regularization

We consider learning deep neural networks (DNNs) that consist of low-precision weights and activations for efficient inference of fixed-point operations. In training low-precision networks, gradient descent in the backward pass is performed with high-precision weights while quantized low-precision weights and activations are used in the forward pass to calculate the loss function for training. Thus, the gradient descent becomes suboptimal, and accuracy loss follows. In order to reduce the mismatch in the forward and backward passes, we utilize mean squared quantization error (MSQE) regularization. In particular, we propose using a learnable regularization coefficient with the MSQE regularizer to reinforce the convergence of high-precision weights to their quantized values. We also investigate how partial L2 regularization can be employed for weight pruning in a similar manner. Finally, combining weight pruning, quantization, and entropy coding, we establish a low-precision DNN compression pipeline. In our experiments, the proposed method yields low-precision MobileNet and ShuffleNet models on ImageNet classification with the state-of-the-art compression ratios of 7.13 and 6.79, respectively. Moreover, we examine our method for image super resolution networks to produce 8-bit low-precision models at negligible performance loss.
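
To make the training recipe in the abstract concrete, the sketch below shows one way such a scheme could look in PyTorch: a linear layer whose forward pass uses uniformly quantized weights through a straight-through estimator, plus an MSQE penalty whose coefficient is itself a learnable parameter. This is a minimal illustration under stated assumptions, not the authors' released implementation: the names (uniform_quantize, QuantLinear, msqe_penalty, log_alpha), the simple symmetric uniform quantizer, and the exact penalty that keeps the coefficient from collapsing to zero are all illustrative choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def uniform_quantize(w, num_bits=4):
    """Symmetric uniform quantization of a tensor to num_bits levels (illustrative)."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = w.detach().abs().max().clamp(min=1e-8) / qmax
    return torch.round(w / scale).clamp(-qmax - 1, qmax) * scale


class QuantLinear(nn.Module):
    """Linear layer: quantized weights in the forward pass, while the
    full-precision weights receive gradients via the straight-through estimator."""

    def __init__(self, in_features, out_features, num_bits=4):
        super().__init__()
        self.weight = nn.Parameter(0.05 * torch.randn(out_features, in_features))
        self.bias = nn.Parameter(torch.zeros(out_features))
        self.num_bits = num_bits
        # Learnable regularization coefficient, parameterized in the log domain
        # so that alpha = exp(log_alpha) stays positive.
        self.log_alpha = nn.Parameter(torch.tensor(0.0))

    def forward(self, x):
        w_q = uniform_quantize(self.weight, self.num_bits)
        # Straight-through estimator: the forward pass uses quantized weights,
        # but the gradient flows to self.weight as if no quantization happened.
        w_ste = self.weight + (w_q - self.weight).detach()
        return F.linear(x, w_ste, self.bias)

    def msqe_penalty(self):
        # alpha * ||w - Q(w)||^2 pulls full-precision weights toward their
        # quantized values; the -log(alpha) term keeps alpha from collapsing to 0.
        # (The paper's exact penalty on the coefficient may differ.)
        alpha = self.log_alpha.exp()
        msqe = (self.weight - uniform_quantize(self.weight, self.num_bits)).pow(2).mean()
        return alpha * msqe - self.log_alpha


# Toy usage: task loss plus the learnable-coefficient MSQE regularizer.
layer = QuantLinear(16, 4, num_bits=4)
opt = torch.optim.SGD(layer.parameters(), lr=0.1)
x, y = torch.randn(8, 16), torch.randint(0, 4, (8,))
for _ in range(10):
    loss = F.cross_entropy(layer(x), y) + layer.msqe_penalty()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

In this sketch, minimizing the penalty over log_alpha drives the coefficient toward 1/MSQE, so the pull toward the quantized values strengthens as the quantization error shrinks, which matches the qualitative behavior the abstract describes (reinforcing convergence of high-precision weights to their quantized values).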


Bibliographic Details
Main Authors: Yoojin Choi, Mostafa El-Khamy, Jungwon Lee
Format: Article
Language: English
Published: IEEE, 2020-01-01
Series: IEEE Access
Subjects: Deep neural networks; fixed-point arithmetic; model compression; quantization; regularization; weight pruning
Online Access: https://ieeexplore.ieee.org/document/9098870/
ISSN: 2169-3536
DOI: 10.1109/ACCESS.2020.2996936
Volume/Pages: IEEE Access, vol. 8, pp. 96963-96974
Author ORCIDs: Yoojin Choi (0000-0002-4496-0738); Mostafa El-Khamy (0000-0001-9421-6037)
Affiliation: SoC Research and Development, Samsung Semiconductor Inc., San Diego, CA, USA (all authors)