As outlined in previous articles, there seems to be a significant difference between regular, unconstrained adversarial examples and adversarial examples constrained to the data manifold. In this article, I want to demonstrate that adversarial training with on-manifold adversarial examples has the potential to improve generalization if the manifold is known or approximated well enough. As alternative, for more complex datasets, knowledge of parts of the manifold is sufficient, leading to a kind of adversarial data augmentation using affine transformations.
Adversarial examples are commonly assumed to leave the manifold of the underyling data — although this has not been confirmed experimentally so far. This means that deep neural networks perform well on the manifold, however, slight perturbations in directions leaving the manifold may cause mis-classification. In this article, based on my recent CVPR’19 paper, I want to empirically show that adversarial examples indeed leave the manifold. For this purpose, I will present results on a synthetic dataset with known manifold as well as on MNIST with approximated manifold.