Summary: | In this paper, we propose a novel robust visual classification framework that uses double quantization (dquant) to defend against adversarial examples in a specific attack scenario, called “subsequent adversarial examples,” in which test images are injected with adversarial noise. The proposed system can remove the adversarial noise completely in this attack scenario. First, we analyze the potential sources of adversarial noise and classify adversarial examples into three classes. We then propose a novel, effective solution, dquant, that targets a specific class of adversarial examples. The first quantization is 1-bit dithering, applied to both training and test images. The second is linear quantization, applied to test images immediately before they are fed to the model, to remove any adversarial noise. The linear quantizer guarantees that the original 1-bit test images are restored regardless of the adversarial noise distance; dquant therefore maintains identical accuracy whether or not the model is under attack. The results show that dquant achieves comparable accuracy, 85.28% on CIFAR-10 and 94.99% on Oxford-IIIT Pet, against three state-of-the-art adversaries, even at a previously untested maximum adversarial distance of 64.
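The abstract does not give implementation details, so the following is a minimal sketch of the dquant pipeline under stated assumptions: 8-bit grayscale pixels in [0, 255], Floyd-Steinberg error diffusion standing in for the unnamed 1-bit dither, and a midpoint-threshold mapping standing in for the linear quantizer, which snaps each pixel back to the nearer 1-bit level. The function names (`dither_1bit`, `linear_quantize`) are hypothetical.

```python
import numpy as np

def dither_1bit(img):
    """Reduce an 8-bit grayscale image to 1 bit (levels 0 and 255).

    Uses Floyd-Steinberg error diffusion; the paper does not specify
    which dithering algorithm dquant uses, so this is an illustrative
    choice. Applied to both training and test images.
    """
    out = img.astype(np.float64).copy()
    h, w = out.shape
    for y in range(h):
        for x in range(w):
            old = out[y, x]
            new = 255.0 if old >= 128 else 0.0
            out[y, x] = new
            err = old - new
            # Diffuse the quantization error to unprocessed neighbors.
            if x + 1 < w:
                out[y, x + 1] += err * 7 / 16
            if y + 1 < h:
                if x > 0:
                    out[y + 1, x - 1] += err * 3 / 16
                out[y + 1, x] += err * 5 / 16
                if x + 1 < w:
                    out[y + 1, x + 1] += err * 1 / 16
    return out.astype(np.uint8)

def linear_quantize(img):
    """Snap every pixel back to the nearer 1-bit level, 0 or 255.

    Applied to test images immediately before model input. Under this
    thresholding sketch, a clean 1-bit pixel is restored exactly as long
    as its perturbation stays below half the level spacing (|delta| < 128).
    """
    return np.where(img.astype(np.int32) >= 128, 255, 0).astype(np.uint8)

# Toy demonstration on a random grayscale "test image".
rng = np.random.default_rng(0)
clean = dither_1bit(rng.integers(0, 256, size=(32, 32)))
noise = rng.integers(-64, 65, size=clean.shape)              # L-inf distance <= 64
adversarial = np.clip(clean.astype(np.int32) + noise, 0, 255).astype(np.uint8)
restored = linear_quantize(adversarial)
assert np.array_equal(restored, clean)                       # noise removed exactly
```

The final assertion illustrates the abstract's restoration claim for the tested setting: every clean pixel is 0 or 255, so a perturbation bounded by 64 cannot move it across the 128 threshold, and the quantizer recovers the original 1-bit image bit for bit.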