DAVID STUTZ

23rd JULY 2018

$\min_\theta \sum_i \max_{r \in U_i} J(\theta, x_i + r, y_i)$
where $U_i$ is called the uncertainty set corresponding to sample $x_i$; in the context of adversarial examples, this might be an $\epsilon$-ball around the sample, bounding the maximum perturbation allowed. Here, $(x_i, y_i)$ are the training samples, $\theta$ the parameters and $J$ the training objective. In practice, when the overall minimization problem is tackled using gradient descent, the inner maximization problem is not solved exactly, as doing so in every iteration would be inefficient. Instead, Shaham et al. propose to alternate single steps on the maximization and the minimization problem, in the spirit of generative adversarial network training.
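To make the alternating scheme concrete, here is a minimal NumPy sketch. It is not the implementation of Shaham et al. It assumes a logistic regression model without bias as $J$, an $L_\infty$ ball of radius `epsilon` as the uncertainty set $U_i$, and a signed gradient step followed by clipping for the ascent on $r$. All function names, step sizes and the toy data are hypothetical and chosen only for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def loss_and_grads(theta, x, y):
    """Logistic loss J summed over samples, with gradients w.r.t. theta and x."""
    p = sigmoid(x @ theta)                      # predicted probabilities, shape (n,)
    tiny = 1e-12                                # avoids log(0)
    loss = -np.sum(y * np.log(p + tiny) + (1 - y) * np.log(1 - p + tiny))
    residual = p - y                            # derivative of J w.r.t. the logits
    grad_theta = x.T @ residual                 # shape (d,)
    grad_x = np.outer(residual, theta)          # shape (n, d), one row per sample
    return loss, grad_theta, grad_x


def alternating_robust_training(x, y, epsilon=0.1, lr_theta=0.1, lr_r=0.05, steps=200):
    """Alternate one ascent step on the perturbations r with one descent step on theta."""
    n, d = x.shape
    theta = np.zeros(d)
    r = np.zeros_like(x)                        # one perturbation r_i per sample x_i
    for _ in range(steps):
        # Inner maximization: a single signed gradient ascent step on r,
        # projected back onto the assumed L-infinity ball by clipping.
        _, _, grad_x = loss_and_grads(theta, x + r, y)
        r = np.clip(r + lr_r * np.sign(grad_x), -epsilon, epsilon)

        # Outer minimization: a single gradient descent step on theta,
        # evaluated at the perturbed samples x + r.
        _, grad_theta, _ = loss_and_grads(theta, x + r, y)
        theta = theta - lr_theta * grad_theta / n
    return theta, r


# Toy usage: two well-separated Gaussian blobs in 2D (made-up data).
rng = np.random.default_rng(0)
x = np.vstack([rng.normal(-2.0, 0.5, size=(50, 2)), rng.normal(2.0, 0.5, size=(50, 2))])
y = np.concatenate([np.zeros(50), np.ones(50)])
theta, r = alternating_robust_training(x, y, epsilon=0.1)
print("learned parameters:", theta)
print("largest perturbation entry:", np.abs(r).max())
```

Keeping `r` across iterations rather than resetting it means each sample's perturbation is refined a little in every step, which is what makes the single inner step cheap. Whether to keep or reinitialize `r` is a design choice of this sketch.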