Adversarial examples are commonly assumed to leave the manifold of the underyling data — although this has not been confirmed experimentally so far. This means that deep neural networks perform well on the manifold, however, slight perturbations in directions leaving the manifold may cause mis-classification. In this article, based on my recent CVPR’19 paper, I want to empirically show that adversarial examples indeed leave the manifold. For this purpose, I will present results on a synthetic dataset with known manifold as well as on MNIST with approximated manifold.
Adversarial examples, imperceptibly perturbed examples causing mis-classification, are commonly assumed to lie off the underlying manifold of the data — the so-called manifold assumption. In this article, following my recent CVPR’19 paper, I demonstrate that adversarial examples can also be found on the data manifold, both on a synthetic dataset as well as on MNIST and Fashion-MNIST.