

Tom B. Brown, Dandelion Mané, Aurko Roy, Martín Abadi, Justin Gilmer. Adversarial Patch. CoRR abs/1712.09665 (2017).

Brown et al. introduce a universal adversarial patch that, when added to an image, will cause a targeted misclassification. The concept is illustrated in Figure 1; essentially, a “sticker” is computed that, when placed randomly on an image, causes misclassification. In practice, the objective function optimized can be written as

$\max_p \mathbb{E}_{x\sim X, t \sim T, l \sim L} \left[\log \Pr(y|A(p,x,l,t))\right]$

where $p$ is the patch, $y$ is the target label, and $X$, $T$ and $L$ are the data space, the transformation space and the location space, respectively. The function $A$ takes the patch $p$ and the image $x$ as input and places the patch on the image at location $l$ after applying the transformation $t$. Note that the adversarial patch is unconstrained: in contrast to typical adversarial examples, the perturbation inside the patch region need not be small or imperceptible. In practice, the computed patch might look as illustrated in Figure 1.
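To make the roles of $A$, $t$ and $l$ concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. It assumes grayscale images stored as nested lists, a square patch, and a transformation space restricted to rotations by multiples of 90 degrees. The names `apply_patch`, `estimate_objective` and `toy_log_prob` are hypothetical, and `toy_log_prob` is only a stand-in for a real classifier's $\log \Pr(y|\cdot)$. The sketch estimates the expectation by sampling and does not perform the maximization over the patch.

```python
import math
import random


def rotate90(patch, k):
    """Rotate a 2D list counter-clockwise by k * 90 degrees."""
    for _ in range(k % 4):
        patch = [list(row) for row in zip(*patch)][::-1]
    return patch


def apply_patch(patch, image, location, k):
    """Sketch of A(p, x, l, t): paste the rotated patch onto a copy of the image.

    Assumption: the transformation t is reduced to k quarter-turn rotations.
    """
    rotated = rotate90(patch, k)
    top, left = location
    out = [list(row) for row in image]  # copy, so the input image is untouched
    for i, row in enumerate(rotated):
        for j, value in enumerate(row):
            out[top + i][left + j] = value  # patch pixels overwrite the image
    return out


def estimate_objective(patch, images, log_prob_target, n_samples=100, seed=0):
    """Monte Carlo estimate of E_{x, t, l}[log Pr(y | A(p, x, l, t))]."""
    rng = random.Random(seed)
    side = len(patch)  # assumption: square patch
    total = 0.0
    for _ in range(n_samples):
        image = rng.choice(images)                       # x ~ X
        k = rng.randrange(4)                             # t ~ T
        top = rng.randrange(len(image) - side + 1)       # l ~ L (row)
        left = rng.randrange(len(image[0]) - side + 1)   # l ~ L (column)
        total += log_prob_target(apply_patch(patch, image, (top, left), k))
    return total / n_samples


def toy_log_prob(image):
    """Hypothetical stand-in for a classifier: brighter images score higher."""
    mean = sum(map(sum, image)) / (len(image) * len(image[0]))
    return math.log(0.01 + 0.98 * mean)  # keeps the "probability" in (0, 1)
```

With a real model, `log_prob_target` would return the log-probability of the target class, and the patch would be updated to increase this estimate, for example by gradient ascent in an automatic differentiation framework.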

Figure 1: Illustration of the optimization procedure to obtain adversarial patches.

Also find this summary on ShortScience.org.
