Fawzi et al. propose an adaptive data augmentation scheme based on adversarial transformations, similar in spirit to adversarial training. In each training iteration, and for each sample or batch, they compute an adversarial version by finding a transformation that maximizes the training loss. The transformation is constrained to a specific class; on MNIST, for example, they consider affine transformations. Additionally, only small transformations are considered, a constraint closely analogous to the perturbation bound used for adversarial examples. To find these adversarial transformations during training, they employ an iterative algorithm based on a local first-order approximation of the training loss on the transformed sample. Each iteration thus reduces to a simple linear program. Overall, however, the search adds significant computational overhead during training (as does adversarial training). Quantitative results on MNIST are encouraging; specifically, Figure 1 shows that the test loss converges faster when these adversarial transformations are used as data augmentation than when random transformations are used.
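
To make the iterative search more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a box constraint on the transformation parameters, in which case the linear program obtained from the first-order approximation has the closed-form solution "step in the sign of the gradient". The toy sample (a few 2D points), the fixed linear model, the finite-difference gradient, and all names (`affine_transform`, `make_loss`, `numerical_grad`, `adversarial_theta`, `eps`, `step`) are illustrative assumptions.

```python
import numpy as np

def affine_transform(points, theta):
    """Apply a small affine map to 2D points; theta = 0 is the identity."""
    A = np.eye(2) + theta[:4].reshape(2, 2)  # linear part, perturbed identity
    b = theta[4:]                            # translation
    return points @ A.T + b

def make_loss(points, label, w):
    """Logistic loss of a fixed linear model on the transformed sample (toy model)."""
    def loss(theta):
        score = w @ affine_transform(points, theta).ravel()
        return np.logaddexp(0.0, -label * score)  # stable log(1 + exp(-y * s))
    return loss

def numerical_grad(loss, theta, h=1e-5):
    """Central finite differences; stands in for backpropagation in this sketch."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = h
        grad[i] = (loss(theta + e) - loss(theta - e)) / (2.0 * h)
    return grad

def adversarial_theta(loss, dim=6, eps=0.1, step=0.02, iters=10):
    """Iteratively maximize the linearized loss over small transformations.

    Assumed constraint: |theta_i| <= eps. Under a box constraint on the update,
    maximizing g^T d subject to |d_i| <= step gives d = step * sign(g).
    """
    theta = np.zeros(dim)  # start from the identity transformation
    for _ in range(iters):
        g = numerical_grad(loss, theta)
        theta = np.clip(theta + step * np.sign(g), -eps, eps)
    return theta

# Toy usage: random "sample" of five 2D points and a random linear model.
rng = np.random.default_rng(0)
points = rng.normal(size=(5, 2))
w = 0.5 * rng.normal(size=10)
loss = make_loss(points, label=1.0, w=w)
theta_adv = adversarial_theta(loss)
print("loss at identity:", loss(np.zeros(6)))
print("loss at adversarial transform:", loss(theta_adv))
```

In an actual training loop, the transformed sample would replace (or complement) the original one when computing the parameter update, and the gradient with respect to the transformation parameters would come from automatic differentiation. The inner loop also illustrates where the extra computational cost arises: each sample requires several additional loss and gradient evaluations.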