Gilmer et al. study the existence of adversarial examples on a synthetic toy datasets consisting of two concentric spheres. The dataset is created by randomly sampling examples from two concentric spheres, one with radius $1$ and one with radius $R = 1.3$. While the authors argue that difference difficulties of the dataset can be created by varying $R$ and the dimensionality, they merely experiment with $R = 1.3$ and a dimensionality of $500$. The motivation to study this dataset comes form the idea that adversarial examples can easily be found by leaving the data manifold. Based on this simple dataset, the authors provide several theoretical insights – see the paper for details.
Beneath theoretical insights, Gilmer et al. also discuss the so-called manifold attack, an attack using projected gradient descent which ensures that the adversarial examples stays on the data-manifold – moreover, it is ensured that the class does not change. Unfortunately (as I can tell), this idea of a manifold attack is not studied further – which is very unfortunate and renders the question while this concept was introduced in the first place.
One of the main take-aways is the suggestion that there is a trade-off between accuracy (i.e. the ability of the network to perform well) and the average distance to an adversarial example. Thus, the existence of adversarial examples might be related to the question why deep neural networks perform very well.
What is your opinion on the summarized work? Or do you know related work that is of interest? Let me know your thoughts in the comments below: