Gehr et al. propose a method based on abstract interpretations in order to verify robustness guarantees of neural networks. First of all, I want to note that (in contrast to most work in adversarial robustness) the proposed method is not intended to improve robustness, but to get robustness certificates. Without going into details, abstract interpretations allow to verify conditions (e.g., robustness) of a function (e.g., a neural network) based on abstractions of the input. In particular, by abstracting a norm-ball around a test sample (as is typically considered in adversarial robustness) using box constraints or polyhedra, leading to an over-approximation of the norm-ball, and transforming these abstractions according to the layers of a network, the network’s output can be checked against robustness conditions without running the network on all individual points in the norm-ball. As a result, if the proposed method certifies robustness for a given input sample and an area around it, the network s indeed robust in this area (soundness). If not, the network might indeed not be robust, or robustness could not be certified due to the method’s over-approximation. For details, I refer to the paper, as well as follow-up work  and .
What is your opinion on the summarized work? Or do you know related work that is of interest? Let me know your thoughts in the comments below: