Variational Bayesian Group-Level Sparsification for Knowledge Distillation

Deep neural networks are capable of learning powerful representations, but are often limited by heavy network architectures and high computational cost. Knowledge distillation (KD) is an effective way to perform model compression and inference acceleration. However, the final student models remain pa...

Bibliographic Details
Main Authors: Yue Ming, Hao Fu, Yibo Jiang, Hui Yu
Format: Article
Language: English
Published: IEEE 2020-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/9139512/