Wieland Brendel, Matthias Bethge. Comment on "Biologically inspired protection of deep networks from adversarial attacks". CoRR abs/1704.01547, 2017.

Brendel et al. propose a decision-based black-box attacks against (deep convolutional) neural networks. Specifically, the so-called Boundary Attack starts with a random adversarial example (i.e. random noise that is not classified as the image to be attacked) and randomly perturbs this initialization to move closer to the target image while remaining misclassified. In pseudo code, the algorithm is described in Algorithm 1. Key component is the proposal distribution $P$ used to guide the adversarial perturbation in each step. In practice, they use a maximum-entropy distribution (e.g. uniform) with a couple of constraints: the perturbed sample is a valid image; the perturbation has a specified relative size, i.e. $\|\eta^k\|_2 = \delta d(o, \tilde{o}^{k-1})$; and the perturbation reduces the distance to the target image $o$: $d(o, \tilde{o}^{k-1}) – d(o,\tilde{o}^{k-1} + \eta^k)=\epsilon d(o, \tilde{o}^{k-1})$. This is approximated by sampling from a standard Gaussian, clipping and rescaling and projecting the perturbation onto the $\epsilon$-sphere around the image. In experiments, they show that this attack is competitive to white-box attacks and can attack real-world systems.

Algorithm 1: Minimal pseudo code version of the boundary attack.

Also find this summary on ShortScience.org.
What is your opinion on this article? Let me know your thoughts on Twitter @davidstutz92 or LinkedIn in/davidstutz92.