Fawzi et al. propose an adaptive data augmentation scheme based on adversarial transformations, closely related to adversarial training. In each training iteration, and for each sample or batch, they compute an adversarial version of the input by finding a transformation that maximizes the training loss. The search is restricted to a specific class of transformations; on MNIST, for example, they consider affine transformations. Additionally, only small transformations are allowed, a constraint very similar to the perturbation constraint of adversarial examples. To find these adversarial transformations during training, they employ an iterative algorithm based on a local first-order approximation of the training loss on the transformed sample. Each iteration then reduces to a simple linear program. Overall, however, this adds significant computational overhead during training, as does adversarial training. Quantitative results on MNIST are encouraging; specifically, Figure 1 shows that the test loss converges faster when these adversarial transformations are used as data augmentation instead of random transformations.
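To make the idea of a loss-maximizing transformation more concrete, here is a minimal sketch in plain Python. It is not the authors' algorithm: the 1-D signal, the linear classifier with logistic loss, the finite-difference gradient, the box constraint on the parameters, and all function names (`warp`, `adversarial_transform`, and so on) are assumptions made for illustration. Under a box constraint, the linearized problem is a linear program whose solution sits at a corner of the box, so each step simply moves every parameter by a fixed radius in the direction of the gradient sign.

```python
import math

def warp(signal, theta):
    """Resample a 1-D signal under the affine map u -> (1 + a) * (u - c) + c + b."""
    a, b = theta
    n = len(signal)
    c = (n - 1) / 2.0  # scale about the center of the signal
    out = []
    for i in range(n):
        u = (1.0 + a) * (i - c) + c + b
        u = min(max(u, 0.0), n - 1.0)  # clamp source coordinate to the border
        lo = int(math.floor(u))
        hi = min(lo + 1, n - 1)
        frac = u - lo
        out.append((1.0 - frac) * signal[lo] + frac * signal[hi])  # linear interpolation
    return out

def logistic_loss(weights, bias, signal, label):
    """Logistic loss of a linear classifier; label is -1 or +1."""
    score = sum(w * s for w, s in zip(weights, signal)) + bias
    margin = label * score
    # Numerically stable form of log(1 + exp(-margin)).
    return math.log1p(math.exp(-abs(margin))) + max(-margin, 0.0)

def loss_at(theta, weights, bias, signal, label):
    return logistic_loss(weights, bias, warp(signal, theta), label)

def theta_gradient(theta, weights, bias, signal, label, h=1e-3):
    """Central finite differences of the loss w.r.t. the transformation parameters."""
    grad = []
    for k in range(len(theta)):
        plus, minus = list(theta), list(theta)
        plus[k] += h
        minus[k] -= h
        diff = loss_at(plus, weights, bias, signal, label) - loss_at(minus, weights, bias, signal, label)
        grad.append(diff / (2.0 * h))
    return grad

def adversarial_transform(weights, bias, signal, label, budget=(0.1, 1.0), steps=5):
    """Iteratively maximize the linearized loss; |theta[k]| never exceeds budget[k]."""
    theta = [0.0, 0.0]  # start from the identity transformation
    best_theta, best_loss = list(theta), loss_at(theta, weights, bias, signal, label)
    for _ in range(steps):
        grad = theta_gradient(theta, weights, bias, signal, label)
        for k, g in enumerate(grad):
            radius = budget[k] / steps       # per-step trust region (assumed)
            direction = (g > 0) - (g < 0)    # corner of the box that maximizes g * d
            theta[k] = min(max(theta[k] + radius * direction, -budget[k]), budget[k])
        current = loss_at(theta, weights, bias, signal, label)
        if current > best_loss:              # keep the worst case seen so far
            best_theta, best_loss = list(theta), current
    return best_theta, best_loss

# Toy example: a bump-shaped signal and a hand-picked linear classifier.
signal = [0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0]
weights = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0]
bias, label = -4.0, 1
clean_loss = loss_at([0.0, 0.0], weights, bias, signal, label)
adv_theta, adv_loss = adversarial_transform(weights, bias, signal, label)
augmented = warp(signal, adv_theta)
```

In a training loop, `augmented` would replace the original sample in the parameter update. The repeated loss evaluations per sample also hint at where the computational overhead mentioned above comes from.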
What is your opinion on the summarized work? Do you know of related work that is of interest? Let me know your thoughts in the comments below.