19^{th}JULY2018

Takeru Miyato, Shin-ichi Maeda, Masanori Koyama, Ken Nakae, Shin Ishii. *Distributional Smoothing with Virtual Adversarial Training*. CoRR, abs/1507.00677, 2017.

Also find this summary on ShortScience.org.

What is **your opinion** on the summarized work? Or do you know related work that is of interest? **Let me know** your thoughts in the comments below:

Miyato et al. propose distributional smoothing (or virtual adversarial training) as defense against adversarial examples. However, I think that both terms do not give a good intuition of what is actually done. Essentially, a regularization term is introduced. Letting $p(y|x,\theta)$ be the learned model, the regularizer is expressed as

$\text{KL}(p(y|x,\theta)|p(y|x+r,\theta)$

where $r$ is the perturbation that maximizes the Kullback-Leibler divergence above, i.e.

$r = \arg\max_r \{\text{KL}(p(y|x,\theta)|p(y|x+r,\theta) | \|r\|_2 \leq \epsilon\}$

with hyper-parameter $\epsilon$. Essentially, the regularizer is supposed to “simulate” adversarial training – thus, the method is also called virtual adversarial training.

The discussed implementation, however, is somewhat cumbersome. In particular, $r$ cannot be computed using first-order methods as the gradient of $\text{KL}$ is $0$ for $r = 0$. So a second-order method is used – for which the Hessian needs to be approximated and the corresponding eigenvectors need to be computed. For me it is unclear why $r$ cannot be initialized randomly to solve this issue … Then, the derivative of the regularizer needs to be computed during training. Here, the authors make several simplifications (such as fixing $\theta$ in the first part of the Kullback-Leibler divergence and ignoring the derivative of $r$ w.r.t. $\theta$).

Overall, however, I like the idea of “virtual” adversarial training as it avoids the need of explicitly using attacks during training to craft adversarial examples. Then, the trained model is often robust against the chosen attacks, but new adversarial examples can be found easily through novel attacks.