15^{th}OCTOBER2019

Luca de Alfaro. *Neural Networks with Structural Resistance to Adversarial Attacks*. CoRR abs/1809.09262 (2018).

Also find this summary on ShortScience.org.

What is **your opinion** on this article? **Let me know** your thoughts on Twitter @davidstutz92 or LinkedIn in/davidstutz92.

De Alfaro proposes a deep radial basis function (RBF) network to obtain robustness against adversarial examples. In contrast to “regular” RBF networks, which usually consist of only one hidden layer containing RBF units, de Alfaro proposes to stack multiple layers with RBF units. Specifically, a Gaussian unit utilizing the $L_\infty$ norm is used:

$\exp\left( - \max_i(u_i(x_i – w_i))^2\right)$

where $u_i$ and $w_i$ are parameters and $x_i$ are the inputs to the unit – so the network inputs or the outputs of the previous hidden layer. This unit can be understood as computing a soft AND operation; therefore, an alternative OR operation

$1 - \exp\left( - \max_i(u_i(x_i – w_i))^2\right)$

is used as well. These two units are used alternatingly in hidden layers in the conducted experiments. Based on these units, de Alfaro argues that the model is less sensitive to adversarial examples, compared to linear operations as commonly used in ReLU networks.

For training a deep RBF-network, pseudo gradients are used for both the maximum operation and the exponential function. This is done for simplifying training; I refer to the paper for details.

In their experiments, on MNIST, a multi-layer perceptron with the proposed RBF units is used. The network consists of 512 AND units, 512 OR units, 512 AND units and finally 10 OR units. Robustness against FGSM and I-FGSM as well as PGD attacks seems to improve. However, the used PGD attack seems to be weaker than usually, it does not manage to reduce adversarial accuracy of a normal networks to near-zero.