Luis Muñoz-González, Battista Biggio, Ambra Demontis, Andrea Paudice, Vasin Wongrassamee, Emil C. Lupu, Fabio Roli. Towards Poisoning of Deep Learning Algorithms with Back-gradient Optimization. AISec@CCS 2017.

Munoz-Gonzalez et al. propose a multi-class data poisening attack against deep neural networks based on back-gradient optimization. They consider the common poisening formulation stated as follows:

$\max_{D_c} \min_w \mathcal{L}(D_c \cup D_{tr}, w)$

where $D_c$ denotes a set of poisened training samples and $D_{tr}$ the corresponding clean dataset. Here, the loss $\mathcal{L}$ used for training is minimized as the inner optimization problem. As result, as long as learning itself does not have closed-form solutions, e.g., for deep neural networks, the problem is computationally infeasible. To resolve this problem, the authors propose using back-gradient optimization. Then, the gradient with respect to the outer optimization problem can be computed while only computing a limited number of iterations to solve the inner problem, see the paper for detail. In experiments, on spam/malware detection and digit classification, the approach is shown to increase test error of the trained model with only few training examples poisened.

Also find this summary on ShortScience.org.
What is your opinion on this article? Let me know your thoughts on Twitter @davidstutz92 or LinkedIn in/davidstutz92.