Lightweight Compression of Intermediate Neural Network Features for Collaborative Intelligence

Bibliographic Details
Main Authors: Robert A. Cohen, Hyomin Choi, Ivan V. Bajic
Format: Article
Language: English
Published: IEEE 2021-01-01
Series: IEEE Open Journal of Circuits and Systems
Subjects: Collaborative intelligence; deep learning; neural network compression; feature compression; quantization
Online Access: https://ieeexplore.ieee.org/document/9430648/
id doaj-0bba4d1f95a54dd89e882f8d4bd1d703
record_format Article
spelling doaj-0bba4d1f95a54dd89e882f8d4bd1d703 2021-05-13T23:01:00Z
language: eng
publisher: IEEE
journal: IEEE Open Journal of Circuits and Systems, ISSN 2644-1225, 2021-01-01, vol. 2, pp. 350-362
doi: 10.1109/OJCAS.2021.3072884 (IEEE article 9430648)
title: Lightweight Compression of Intermediate Neural Network Features for Collaborative Intelligence
authors: Robert A. Cohen (https://orcid.org/0000-0001-7724-8993), Hyomin Choi, Ivan V. Bajic (https://orcid.org/0000-0003-3154-5743); School of Engineering Science, Simon Fraser University, Burnaby, Canada
abstract: In collaborative intelligence applications, part of a deep neural network (DNN) is deployed on a lightweight device such as a mobile phone or edge device, and the remaining portion of the DNN is processed where more computing resources are available, such as in the cloud. This paper presents a novel lightweight compression technique designed specifically to quantize and compress the features output by the intermediate layer of a split DNN, without requiring any retraining of the network weights. A mathematical model for estimating the clipping and quantization error of ReLU and leaky-ReLU activations at this intermediate layer is developed and used to compute optimal clipping ranges for coarse quantization. We also present a modified entropy-constrained design algorithm for quantizing clipped activations. When applied to popular object-detection and classification DNNs, we were able to compress the 32-bit floating-point intermediate activations down to 0.6 to 0.8 bits, while keeping the loss in accuracy to less than 1%. When compared to HEVC, we found that the lightweight codec consistently provided better inference accuracy, by up to 1.3%. The performance and simplicity of this lightweight compression technique make it an attractive option for coding an intermediate layer of a split neural network for edge/cloud applications.
url: https://ieeexplore.ieee.org/document/9430648/
keywords: Collaborative intelligence; deep learning; neural network compression; feature compression; quantization
collection DOAJ
language English
format Article
sources DOAJ
author Robert A. Cohen
Hyomin Choi
Ivan V. Bajic
spellingShingle Robert A. Cohen
Hyomin Choi
Ivan V. Bajic
Lightweight Compression of Intermediate Neural Network Features for Collaborative Intelligence
IEEE Open Journal of Circuits and Systems
Collaborative intelligence
deep learning
neural network compression
feature compression
quantization
author_facet Robert A. Cohen
Hyomin Choi
Ivan V. Bajic
author_sort Robert A. Cohen
title Lightweight Compression of Intermediate Neural Network Features for Collaborative Intelligence
title_short Lightweight Compression of Intermediate Neural Network Features for Collaborative Intelligence
title_full Lightweight Compression of Intermediate Neural Network Features for Collaborative Intelligence
title_fullStr Lightweight Compression of Intermediate Neural Network Features for Collaborative Intelligence
title_full_unstemmed Lightweight Compression of Intermediate Neural Network Features for Collaborative Intelligence
title_sort lightweight compression of intermediate neural network features for collaborative intelligence
publisher IEEE
series IEEE Open Journal of Circuits and Systems
issn 2644-1225
publishDate 2021-01-01
description In collaborative intelligence applications, part of a deep neural network (DNN) is deployed on a lightweight device such as a mobile phone or edge device, and the remaining portion of the DNN is processed where more computing resources are available, such as in the cloud. This paper presents a novel lightweight compression technique designed specifically to quantize and compress the features output by the intermediate layer of a split DNN, without requiring any retraining of the network weights. A mathematical model for estimating the clipping and quantization error of ReLU and leaky-ReLU activations at this intermediate layer is developed and used to compute optimal clipping ranges for coarse quantization. We also present a modified entropy-constrained design algorithm for quantizing clipped activations. When applied to popular object-detection and classification DNNs, we were able to compress the 32-bit floating-point intermediate activations down to 0.6 to 0.8 bits, while keeping the loss in accuracy to less than 1%. When compared to HEVC, we found that the lightweight codec consistently provided better inference accuracy, by up to 1.3%. The performance and simplicity of this lightweight compression technique make it an attractive option for coding an intermediate layer of a split neural network for edge/cloud applications.
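The clip-then-quantize step summarized in the description can be illustrated with a minimal sketch. This is not the authors' code: the function name `clip_and_quantize`, the clipping range, and the 2-bit width are illustrative assumptions, showing only how a clipped leaky-ReLU feature tensor is coarsely quantized to a few levels and reconstructed.

```python
import numpy as np

def clip_and_quantize(features, c_min, c_max, bits):
    """Clip features to [c_min, c_max], then uniformly quantize
    them to 2**bits levels (coarse quantization of a feature tensor)."""
    levels = 2 ** bits
    step = (c_max - c_min) / (levels - 1)
    clipped = np.clip(features, c_min, c_max)
    indices = np.round((clipped - c_min) / step).astype(np.int64)
    dequantized = c_min + indices * step  # reconstruction at the cloud side
    return indices, dequantized

# Hypothetical leaky-ReLU feature tensor (negative slope 0.1)
rng = np.random.default_rng(0)
x = rng.normal(size=(16, 8, 8))
features = np.where(x > 0, x, 0.1 * x)

# 2-bit coarse quantization with an assumed clipping range
indices, recon = clip_and_quantize(features, c_min=-0.3, c_max=3.0, bits=2)
```

Only the integer `indices` would be entropy-coded and transmitted; the reconstruction error of any unclipped value is bounded by half the quantizer step.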
topic Collaborative intelligence
deep learning
neural network compression
feature compression
quantization
url https://ieeexplore.ieee.org/document/9430648/
work_keys_str_mv AT robertacohen lightweightcompressionofintermediateneuralnetworkfeaturesforcollaborativeintelligence
AT hyominchoi lightweightcompressionofintermediateneuralnetworkfeaturesforcollaborativeintelligence
AT ivanvbajic lightweightcompressionofintermediateneuralnetworkfeaturesforcollaborativeintelligence
_version_ 1721441714722308096
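The modified entropy-constrained design algorithm mentioned in the abstract is not given in this record. As a rough illustration only, a generic entropy-constrained scalar quantizer (ECSQ) design loop, in the spirit of classic entropy-constrained quantization, might look like the sketch below; every name and parameter here is an assumption, not the paper's actual method.

```python
import numpy as np

def ecsq_design(samples, n_levels=4, lam=0.05, iters=50):
    """Generic entropy-constrained scalar quantizer design sketch:
    each sample is assigned to the level minimizing squared error
    plus lam times its codeword length (-log2 of the level's
    probability); centroids and probabilities are then re-estimated."""
    codebook = np.quantile(samples, np.linspace(0.05, 0.95, n_levels))
    probs = np.full(n_levels, 1.0 / n_levels)
    assign = np.zeros(samples.shape, dtype=np.int64)
    for _ in range(iters):
        # rate-distortion cost of mapping each sample to each level
        cost = (samples[:, None] - codebook[None, :]) ** 2 \
               - lam * np.log2(np.maximum(probs, 1e-12))[None, :]
        assign = np.argmin(cost, axis=1)
        for j in range(n_levels):
            mask = assign == j
            probs[j] = mask.mean()
            if mask.any():
                codebook[j] = samples[mask].mean()
    return codebook, probs, assign

rng = np.random.default_rng(1)
samples = np.maximum(rng.normal(size=5000), 0.0)  # ReLU-like 1-D data
codebook, probs, assign = ecsq_design(samples)
```

Raising `lam` biases the design toward fewer effective levels (lower entropy, coarser rate), which is how sub-1-bit average rates become reachable in principle.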