Kanbak et al. propose ManiFool, a method to determine a network’s invariance to transformations by iteratively finding adversarial transformations. In particular, given a class of transformations to consider, ManiFool iteratively alternates two steps. First, a gradient step is taken in order to move into an adversarial direction; then, the obtained perturbation/direction is projected back to the space of allowed transformations. While the details are slightly more involved, I found that this approach is similar to the general projected gradient ascent approach to finding adversarial examples. By finding worst-case transformations for a set of test samples, Kanbak et al. Are able to quantify the invariance of a network against specific transformations. Furthermore, they show that adversarial fine-tuning using the found adversarial transformations allows to boost invariance, while only incurring a small loss in general accuracy. Examples of the found adversarial transformations are shown in Figure 1.
What is your opinion on the summarized work? Or do you know related work that is of interest? Let me know your thoughts in the comments below: