Liu et al. propose slightly perturbing the weights of a deep neural network in order to cause misclassification of a specific input. The authors propose two attacks: the single bias attack, where a single bias value is manipulated, and the gradient descent attack, where the weights of a particular layer are manipulated through gradient descent. In both cases, the input example is fixed, and the attack is intended to change the label predicted for this input while being “stealthy”, i.e., without changing the overall accuracy too much. Experiments on MNIST and CIFAR10 show that these attacks are effective in changing the input’s label; however, they also reduce the overall accuracy of the model.
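To make the two ideas concrete, here is a minimal NumPy sketch. It is not the authors' implementation: it assumes a toy linear softmax classifier with random weights and random inputs instead of a deep network on MNIST or CIFAR10, and the function names (`predict`, `single_bias_attack`, `gradient_descent_attack`) as well as the margin, learning rate, and step limit are hypothetical choices made for illustration only. The first function changes a single bias just enough to flip the label of the fixed input; the second runs gradient descent on the weights of the (only) layer using the cross-entropy loss for the target label.

```python
import numpy as np


def predict(W, b, x):
    """Predicted class(es) of a linear classifier with logits x W^T + b."""
    return np.argmax(x @ W.T + b, axis=-1)


def single_bias_attack(W, b, x, target, margin=1e-3):
    """Illustrative only: raise the target class's bias until x gets label `target`."""
    logits = W @ x + b
    b_new = b.copy()
    # Smallest increase that lifts the target logit above the current maximum.
    b_new[target] += np.max(logits) - logits[target] + margin
    return b_new


def gradient_descent_attack(W, b, x, target, lr=0.1, max_steps=1000):
    """Illustrative only: gradient descent on W until x gets label `target`."""
    W_new = W.copy()
    onehot = np.zeros(W.shape[0])
    onehot[target] = 1.0
    for _ in range(max_steps):
        logits = W_new @ x + b
        if np.argmax(logits) == target:
            break
        p = np.exp(logits - logits.max())
        p /= p.sum()
        # Gradient of the cross-entropy loss w.r.t. W is (p - onehot) x^T.
        W_new -= lr * np.outer(p - onehot, x)
    return W_new


# Toy setup (assumption): random "model" and random inputs.
rng = np.random.default_rng(0)
W = rng.normal(size=(5, 20))
b = np.zeros(5)
X = rng.normal(size=(200, 20))
x = X[0]                      # the fixed input under attack
original = predict(W, b, x)
target = (original + 1) % 5   # any label different from the original

b_sba = single_bias_attack(W, b, x, target)
W_gda = gradient_descent_attack(W, b, x, target)

# "Stealthiness" proxy: fraction of the other inputs whose prediction is unchanged.
clean = predict(W, b, X[1:])
agree_sba = np.mean(predict(W, b_sba, X[1:]) == clean)
agree_gda = np.mean(predict(W_gda, b, X[1:]) == clean)
print("single bias attack, unchanged predictions:", agree_sba)
print("gradient descent attack, unchanged predictions:", agree_gda)
```

The agreement on the remaining inputs serves as a rough stand-in for the drop in accuracy discussed above: both attacks flip the label of the fixed input, but they may also change predictions on other inputs.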