Aman Sinha, Hongseok Namkoong, John C. Duchi. Certifiable Distributional Robustness with Principled Adversarial Training. CoRR abs/1710.10571, 2017.

Sinha et al. introduce a variant of adversarial training based on distributionally robust optimization. I strongly recommend reading the paper to understand the theoretical framework it introduces. The authors also provide guarantees on the obtained adversarial loss and show experimentally that this guarantee is a realistic indicator. The adversarial training variant itself follows the general strategy of training on adversarially perturbed training samples in a min-max framework. In each iteration, an attacker crafts an adversarial example which the network is then trained on. In a nutshell, their approach differs from previous ones (apart from the theoretical framework) in the attacker used. Specifically, their attacker optimizes

$\arg\max_z l(\theta, z) - \gamma \|z - z^t\|_p^2$

where $z^t$ is a training sample chosen randomly during training and $\gamma$ weights the penalty on the distance between the adversarial example $z$ and $z^t$. On a side note, I also recommend reading the reviews of this paper.
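To make the attacker's objective more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: as simplifying assumptions, it uses a linear model with logistic loss, perturbs only the input $x$ of $z = (x, y)$, fixes $p = 2$, and solves the inner maximization with a fixed number of gradient ascent steps. All function names, step sizes and the value of $\gamma$ are hypothetical choices for illustration.

```python
import math


def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def logistic_loss(theta, x, y):
    # l(theta, z) for z = (x, y) with label y in {-1, +1} (assumed model).
    return math.log1p(math.exp(-y * dot(theta, x)))


def loss_scale(theta, x, y):
    # Shared scalar factor of the loss gradient w.r.t. both x and theta.
    return -y / (1.0 + math.exp(y * dot(theta, x)))


def penalized_objective(theta, x, y, x0, gamma):
    # l(theta, z) - gamma * ||z - z^t||_2^2, with only x being perturbed.
    dist_sq = sum((xi - x0i) ** 2 for xi, x0i in zip(x, x0))
    return logistic_loss(theta, x, y) - gamma * dist_sq


def attack(theta, x0, y, gamma=1.0, step_size=0.1, steps=50):
    # Gradient ascent on the penalized objective, starting at the clean sample.
    x = list(x0)
    for _ in range(steps):
        scale = loss_scale(theta, x, y)
        x = [
            xi + step_size * (scale * ti - 2.0 * gamma * (xi - x0i))
            for xi, ti, x0i in zip(x, theta, x0)
        ]
    return x


def train_step(theta, x0, y, gamma=1.0, lr=0.1):
    # Outer minimization: one SGD step on the loss at the adversarial example.
    x_adv = attack(theta, x0, y, gamma=gamma)
    scale = loss_scale(theta, x_adv, y)
    new_theta = [ti - lr * scale * xi for ti, xi in zip(theta, x_adv)]
    return new_theta, x_adv


theta = [1.0, -2.0]
x0, y = [0.5, 0.5], 1
new_theta, x_adv = train_step(theta, x0, y)
```

In this sketch, the penalty term keeps the adversarial example close to the training sample without a hard norm constraint: a larger `gamma` means a weaker attacker, and for this toy model a sufficiently large `gamma` makes the inner objective concave in `x`, so plain gradient ascent is well-behaved.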

Also find this summary on ShortScience.org.
