Aman Sinha, Hongseok Namkoong, John C. Duchi. Certifiable Distributional Robustness with Principled Adversarial Training. CoRR abs/1710.10571, 2017.

Sinha et al. introduce a variant of adversarial training based on distributionally robust optimization. I strongly recommend reading the paper to understand the theoretical framework it introduces. The authors also provide guarantees on the obtained adversarial loss, and show experimentally that this guarantee is a realistic indicator. The adversarial training variant itself follows the general strategy of training on adversarially perturbed training samples in a min-max framework: in each iteration, an attacker crafts an adversarial example on which the network is then trained. In a nutshell, their approach differs from previous ones (apart from the theoretical framework) in the attacker used. Specifically, their attacker optimizes

$\arg\max_z l(\theta, z) - \gamma \|z - z^t\|_p^2$

where $z^t$ is a training sample chosen randomly during training, and the $\gamma$-penalty discourages perturbations $z$ that stray far from $z^t$. On a side note, I also recommend reading the reviews of this paper.
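To make the inner maximization concrete, here is a minimal NumPy sketch of solving the attacker's objective by plain gradient ascent. For $\gamma$ large enough relative to the smoothness of the loss in $z$, the objective is strongly concave, so ascent converges; the `loss_grad` helper and all parameter values are hypothetical and only serve illustration.

```python
import numpy as np

def robust_perturbation(theta, z_t, loss_grad, gamma=2.0, lr=0.1, steps=100):
    """Sketch of the inner maximization
        argmax_z  l(theta, z) - gamma * ||z - z_t||_2^2
    via gradient ascent. `loss_grad(theta, z)` is a hypothetical helper
    returning the gradient of l with respect to z."""
    z = z_t.copy()
    for _ in range(steps):
        # Gradient of the penalized objective with respect to z.
        grad = loss_grad(theta, z) - 2.0 * gamma * (z - z_t)
        z = z + lr * grad
    return z

# Toy example: l(theta, z) = theta^T z, so grad_z l = theta and the
# maximizer has the closed form z_t + theta / (2 * gamma).
theta = np.array([1.0, -2.0])
z_t = np.array([0.5, 0.5])
z_star = robust_perturbation(theta, z_t, lambda th, z: th, gamma=2.0)
```

In the toy case, `z_star` matches the closed-form solution `z_t + theta / (2 * gamma)`; for a neural network loss, `loss_grad` would instead be a backward pass with respect to the input.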
