Alex Lamb, Vikas Verma, Juho Kannala, Yoshua Bengio. Interpolated Adversarial Training: Achieving Robust Neural Networks Without Sacrificing Too Much Accuracy. AISec@CCS 2019: 95-103.

Lamb et al. propose interpolated adversarial training to increase robustness against adversarial examples. Specifically, a $50\%/50\%$ variant of adversarial training is used, i.e., in each iteration the batch consists of $50\%$ clean and $50\%$ adversarial examples. The loss is computed separately on both parts, encouraging the network to predict the correct labels on the adversarial examples, and the two terms are averaged afterwards. In interpolated adversarial training, the loss is additionally adapted according to the Mixup strategy: instead of computing the loss on a single input-output pair, a second input-output pair is selected at random from the dataset. A random linear interpolation between both inputs is then considered, meaning that the loss is computed as

$\lambda \mathcal{L}(f(x'), y_i) + (1 - \lambda)\mathcal{L}(f(x'), y_j)$

where $f$ is the neural network and $x' = \lambda x_i + (1 - \lambda)x_j$ is the interpolated input corresponding to the two input-output pairs $(x_i, y_i)$ and $(x_j, y_j)$. In a variant called Manifold Mixup, the interpolation is performed within a hidden layer instead of in the input space. This strategy is applied to both the clean and the adversarial examples and, according to the experiments, leads to the same level of robustness while improving test accuracy.
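The Mixup loss above is straightforward to sketch in code. The following is a minimal NumPy illustration, not the authors' implementation: `model` stands in for any classifier returning class probabilities, and all names are illustrative.

```python
import numpy as np

def cross_entropy(probs, label):
    # Negative log-likelihood of the true label under the predicted distribution.
    return -np.log(probs[label])

def mixup_loss(model, x_i, y_i, x_j, y_j, lam):
    """Mixup loss for two input-output pairs (x_i, y_i) and (x_j, y_j).

    The inputs are linearly interpolated with weight lam, and the losses
    with respect to both labels are combined with the same weight:
        lam * L(f(x'), y_i) + (1 - lam) * L(f(x'), y_j)
    """
    x_mix = lam * x_i + (1.0 - lam) * x_j
    probs = model(x_mix)
    return lam * cross_entropy(probs, y_i) + (1.0 - lam) * cross_entropy(probs, y_j)
```

In interpolated adversarial training, this loss would be evaluated on the clean half of the batch as well as on the adversarial half; for Manifold Mixup, the interpolation of `x_i` and `x_j` would instead happen on hidden activations inside `model`.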

Also find this summary on ShortScience.org.
What is your opinion on this article? Let me know your thoughts on Twitter @davidstutz92 or LinkedIn in/davidstutz92.