Robustness may be at odds with accuracy

We show that there exists an inherent tension between the goal of adversarial robustness and that of standard generalization. Specifically, training robust models may not only be more resource-consuming, but also lead to a reduction of standard accuracy. We demonstrate that this trade-off between the standard accuracy of a model and its robustness to adversarial perturbations provably exists even in a fairly simple and natural setting. These findings also corroborate a similar phenomenon observed in practice. Further, we argue that this phenomenon is a consequence of robust classifiers learning fundamentally different feature representations than standard classifiers. These differences, in particular, seem to result in unexpected benefits: the features learned by robust models tend to align better with salient data characteristics and human perception.
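The "fairly simple and natural setting" referred to above can be sketched as follows (a paraphrase of the toy model analyzed in the ICLR 2019 paper; the notation is illustrative and not part of this record):

% Sketch of the paper's binary-classification toy model.
% One feature (x_1) is strongly correlated with the label y; the remaining
% d features are each only weakly correlated with it.
\[
  y \sim \mathrm{Unif}\{-1,+1\}, \qquad
  x_1 = \begin{cases} +y & \text{w.p. } p,\\ -y & \text{w.p. } 1-p, \end{cases}
  \qquad
  x_2,\dots,x_{d+1} \overset{\text{i.i.d.}}{\sim} \mathcal{N}(\eta y,\,1),
  \quad \eta = \Theta(1/\sqrt{d}).
\]
% Averaging the d weak features yields near-perfect standard accuracy, yet an
% \ell_\infty perturbation of size 2\eta flips their correlation with y, so a
% robust classifier must rely on x_1 alone, capping its accuracy at p.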

Bibliographic Details
Main Authors: Tsipras, Dimitris (Author), Santurkar, Shibani (Shibani Vinay) (Author), Engstrom, Logan G. (Author), Turner, Alexander M. (Author), Madry, Aleksander (Author)
Other Authors: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science (Contributor)
Format: Article
Language: English
Published: ICLR, 2019.
Online Access: Get fulltext at https://hdl.handle.net/1721.1/130090
LEADER 01739 am a22002173u 4500
001 130090
042 |a dc 
100 1 0 |a Tsipras, Dimitris  |e author 
100 1 0 |a Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science  |e contributor 
700 1 0 |a Santurkar, Shibani  |q (Shibani Vinay)  |e author 
700 1 0 |a Engstrom, Logan G.  |e author 
700 1 0 |a Turner, Alexander M.  |e author 
700 1 0 |a Madry, Aleksander  |e author 
245 0 0 |a Robustness may be at odds with accuracy 
260 |b ICLR,  |c 2021-03-05T14:59:33Z. 
856 |z Get fulltext  |u https://hdl.handle.net/1721.1/130090 
520 |a We show that there exists an inherent tension between the goal of adversarial robustness and that of standard generalization. Specifically, training robust models may not only be more resource-consuming, but also lead to a reduction of standard accuracy. We demonstrate that this trade-off between the standard accuracy of a model and its robustness to adversarial perturbations provably exists even in a fairly simple and natural setting. These findings also corroborate a similar phenomenon observed in practice. Further, we argue that this phenomenon is a consequence of robust classifiers learning fundamentally different feature representations than standard classifiers. These differences, in particular, seem to result in unexpected benefits: the features learned by robust models tend to align better with salient data characteristics and human perception. 
520 |a National Science Foundation (U.S.) (Grants IIS-1447786, IIS-1607189, CCF-1563880, CCF-1553428) 
546 |a en 
655 7 |a Article 
773 |t 7th International Conference on Learning Representations, ICLR 2019