APQ: Joint Search for Network Architecture, Pruning and Quantization Policy

We present APQ, a novel design methodology for efficient deep learning deployment. Unlike previous methods that separately optimize the neural network architecture, pruning policy, and quantization policy, we optimize them in a joint manner. To deal with the larger design space it brings,...

Bibliographic Details
Main Authors:	Wang, Tianzhe (Author), Wang, Kuan (Author), Cai, Han (Author), Lin, Ji (Author), Liu, Zhijian (Author), Han, Song (Author)
Other Authors:	Massachusetts Institute of Technology. Microsystems Technology Laboratories (Contributor), Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science (Contributor)
Format:	Article
Language:	English
Published:	Institute of Electrical and Electronics Engineers (IEEE), 2021-01-21T19:11:50Z.
Subjects:	Article
Online Access:	Get fulltext

Similar Items