ArXiv Pre-Print “Disentangling Adversarial Robustness and Generalization”

To date, it is unclear whether we can obtain both accurate and robust deep networks — meaning deep networks that generalize well and resist adversarial examples. In this pre-print, we aim to disentangle the relationship between adversarial robustness and generalization. The paper is available on ArXiv.

This paper has been accepted to CVPR'19! See the project page for more details.


Figure 1: Adversarial examples and their (normalized) difference to the original test image in the context of the underlying class manifolds on EMNIST [] (left). Adversarial examples constrained to the manifold, so-called on-manifold adversarial examples, on EMNIST and Fashion-MNIST [] (right).

Obtaining deep networks that are robust against adversarial examples and generalize well is an open problem. A recent hypothesis [][] even states that both robust and accurate models are impossible, i.e., adversarial robustness and generalization are conflicting goals. In an effort to clarify the relationship between robustness and generalization, we assume an underlying, low-dimensional data manifold and show that: 1. regular adversarial examples leave the manifold; 2. adversarial examples constrained to the manifold, i.e., on-manifold adversarial examples, exist; 3. on-manifold adversarial examples are generalization errors, and on-manifold adversarial training boosts generalization; 4. and regular robustness is independent of generalization. These assumptions imply that both robust and accurate models are possible. However, different models (architectures, training strategies etc.) can exhibit different robustness and generalization characteristics. To confirm our claims, we present extensive experiments on synthetic data (with access to the true manifold) as well as on EMNIST [], Fashion-MNIST [] and CelebA [].

Paper on ArXiv
    author    = {David Stutz and Matthias Hein and Bernt Schiele},
    title     = {Disentangling Adversarial Robustness and Generalization},
    journal   = {CoRR},
    volume    = {abs/1812.00740},
    year      = {2018},
  • [] D. Su, H. Zhang, H. Chen, J. Yi, P.-Y. Chen, and Y. Gao. Is robustness the cost of accuracy? – a comprehensive study on the robustness of 18 deep image classification models. arXiv.org, abs/1808.01688, 2018.
  • [] D. Tsipras, S. Santurkar, L. Engstrom, A. Turner, and A. Madry. Robustness may be at odds with accuracy. arXiv.org, abs/1805.12152, 2018.
  • [] G. Cohen, S. Afshar, J. Tapson, and A. van Schaik. EMNIST: an extension of MNIST to handwritten letters. arXiv.org, abs/1702.05373, 2017.
  • [] H. Xiao, K. Rasul, and R. Vollgraf. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms. arXiv.org, abs/1708.07747, 2017.
  • [] Z. Liu, P. Luo, X. Wang, and X. Tang. Deep learning face attributes in the wild. In ICCV, 2015.
What is your opinion on this article? Let me know your thoughts on Twitter @davidstutz92 or LinkedIn in/davidstutz92.