Robust 2-bit Quantization of Weights in Neural Network Modeled by Laplacian Distribution

Bibliographic Details
Main Authors: PERIC, Z., DENIC, B., DINCIC, M., NIKOLIC, J.
Format: Article
Language: English
Published: Stefan cel Mare University of Suceava 2021-08-01
Series: Advances in Electrical and Computer Engineering
Online Access: http://dx.doi.org/10.4316/AECE.2021.03001
Description
Summary: Significant effort is continually devoted to reducing the number of bits required to quantize neural network parameters. Although, in addition to compression, the use of quantizer models that are robust to changes in the variance of the input data is of great importance in neural networks, to the best of the authors' knowledge this topic has not been sufficiently researched so far. For that reason, in this paper we give preference to the logarithmic companding scalar quantizer, which has shown the best robustness in high-quality quantization of speech signals, which, like the weights of neural networks, are modelled by the Laplacian distribution. We explore its performance through exact and asymptotic analysis in the low-resolution scenario of 2-bit quantization, and draw firm conclusions about the usability of the exact performance analysis and design of our quantizer. Moreover, we provide a way to further increase the robustness of the proposed quantizer by introducing additional adaptation of its key parameter. The theoretical and experimental results obtained by applying our quantizer to neural network weights match very well, and for that reason we expect that our proposal will find its way to practical implementation.
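
Note: the exact design parameters of the quantizer are not given in this record. The sketch below (Python/NumPy) only illustrates the general idea the abstract describes: a 2-bit logarithmic (mu-law) companding scalar quantizer whose support region is adapted to the variance of Laplacian-distributed weights. The value mu = 255, the 3-sigma support limit, and the function names are illustrative assumptions, not values or code from the paper.

import numpy as np

def mu_law_compress(x, x_max, mu=255.0):
    # Logarithmic (mu-law) compressor mapping [-x_max, x_max] to [-1, 1].
    return np.sign(x) * np.log1p(mu * np.minimum(np.abs(x), x_max) / x_max) / np.log1p(mu)

def mu_law_expand(y, x_max, mu=255.0):
    # Inverse of the mu-law compressor (expander).
    return np.sign(y) * (x_max / mu) * np.expm1(np.abs(y) * np.log1p(mu))

def quantize_2bit(w, mu=255.0):
    # 2-bit companding quantization of weights w.
    # The support limit x_max is adapted to the empirical standard deviation
    # of the weights; the paper's specific "key parameter" adaptation is not
    # described in this record, so a 3-sigma rule is used as a stand-in.
    sigma = np.std(w)
    x_max = 3.0 * sigma
    y = mu_law_compress(w, x_max, mu)      # compress to [-1, 1]
    levels = 4                             # 2 bits -> 4 reproduction levels
    step = 2.0 / levels
    # Uniform midrise quantizer in the companded domain.
    idx = np.clip(np.floor((y + 1.0) / step), 0, levels - 1)
    y_q = -1.0 + (idx + 0.5) * step
    return mu_law_expand(y_q, x_max, mu)   # expand back to the weight domain

# Example: quantize synthetic Laplacian-distributed "weights" and report SQNR.
rng = np.random.default_rng(0)
w = rng.laplace(scale=1.0 / np.sqrt(2.0), size=10_000)   # unit-variance Laplacian
w_q = quantize_2bit(w)
sqnr_db = 10 * np.log10(np.mean(w**2) / np.mean((w - w_q)**2))
print(f"SQNR: {sqnr_db:.2f} dB")

Because the support limit scales with the sample standard deviation, the same code behaves consistently when the variance of the input weights changes, which is the kind of robustness the abstract emphasizes.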
ISSN: 1582-7445, 1844-7600