APQ: Joint Search for Network Architecture, Pruning and Quantization Policy
We present APQ, a novel design methodology for efficient deep learning deployment. Unlike previous methods that optimize the neural network architecture, pruning policy, and quantization policy separately, we optimize them jointly. To deal with the larger design space it brings,...
Main Authors: | , , , , , |
---|---|
Other Authors: | , |
Format: | Article |
Language: | English |
Published: | Institute of Electrical and Electronics Engineers (IEEE), 2021-01-21T19:11:50Z. |