APQ: Joint Search for Network Architecture, Pruning and Quantization Policy

We present APQ, a novel design methodology for efficient deep learning deployment. Unlike previous methods that separately optimize the neural network architecture, pruning policy, and quantization policy, we optimize them in a joint manner. To deal with the larger design space it brings,...

Bibliographic Details
Main Authors: Wang, Tianzhe (Author), Wang, Kuan (Author), Cai, Han (Author), Lin, Ji (Author), Liu, Zhijian (Author), Han, Song (Author)
Other Authors: Massachusetts Institute of Technology. Microsystems Technology Laboratories (Contributor), Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science (Contributor)
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers (IEEE), 2021-01-21T19:11:50Z.