Alexey Kurakin, Ian J. Goodfellow, Samy Bengio. Adversarial Machine Learning at Scale. CoRR abs/1611.01236, 2016.

Kurakin et al. present larger-scale experiments using adversarial training on ImageNet to increase robustness. In particular, they claim to be the first to apply adversarial training on ImageNet. Furthermore, their experiments support the following conclusions:

  • Adversarial training can also be seen as a regularizer. This is not surprising, however, as training on noisy training samples is also known to act as regularization.
  • Label leaking describes the observation that an adversarially trained model is able to defend against (i.e., correctly classify) adversarial examples that were computed using the true label, while not defending against adversarial examples that were crafted without knowing the true label. This means that crafting adversarial examples without guidance from the true label might be beneficial (in terms of a stronger attack).
  • Model complexity seems to have an impact on robustness after adversarial training. However, from the experiments, it is hard to deduce what exactly this connection looks like.
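
To make the label leaking point more concrete, here is a minimal sketch of a one-step, sign-of-gradient attack in two variants: one guided by the true label and one guided only by the model's own prediction. Everything in it is an assumption made for illustration: the toy linear softmax classifier, the weights, the step size, and the function names (fgsm_true_label, fgsm_predicted_label) are hypothetical and not taken from the paper, which works with much larger ImageNet models.

```python
import math


def logits(W, b, x):
    # Linear classifier: one score per class, z = W x + b.
    return [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]


def softmax(z):
    # Numerically stable softmax.
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def predict(W, b, x):
    # Index of the highest-scoring class.
    z = logits(W, b, x)
    return max(range(len(z)), key=z.__getitem__)


def input_gradient(W, b, x, label):
    # Gradient of the cross-entropy loss with respect to the input x.
    # For a linear softmax model this is W^T (p - onehot(label)).
    p = softmax(logits(W, b, x))
    p[label] -= 1.0
    return [sum(W[k][i] * p[k] for k in range(len(W))) for i in range(len(x))]


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(W, b, x, label, eps):
    # One step of size eps in the direction that increases the loss for `label`.
    g = input_gradient(W, b, x, label)
    return [v + eps * sign(gv) for v, gv in zip(x, g)]


def fgsm_true_label(W, b, x, y_true, eps):
    # The perturbation is a function of the true label, so it can encode it.
    return fgsm(W, b, x, y_true, eps)


def fgsm_predicted_label(W, b, x, eps):
    # The perturbation depends only on the model and the input.
    return fgsm(W, b, x, predict(W, b, x), eps)


if __name__ == "__main__":
    # Hypothetical two-class toy model and inputs, both with true label 0.
    W = [[1.0, -1.0], [-1.0, 1.0]]
    b = [0.0, 0.0]
    for x in ([0.3, 0.1], [0.1, 0.3]):
        print(x, fgsm_true_label(W, b, x, 0, 0.25), fgsm_predicted_label(W, b, x, 0.25))
```

For an input the model classifies correctly, both variants produce the same perturbation. They only differ where the prediction is wrong, which is exactly where the true-label variant injects information about the label that a model could learn to exploit during adversarial training. The predicted-label variant never sees the true label, so its perturbation cannot leak it.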