Moustapha Cissé, Piotr Bojanowski, Edouard Grave, Yann Dauphin, Nicolas Usunier. Parseval Networks: Improving Robustness to Adversarial Examples. ICML 2017: 854-863.

Cisse et al. propose Parseval networks, deep neural networks regularized to learn orthonormal weight matrices. Similar to the work by Hein and Andriushchenko [1], the main idea is to constrain the Lipschitz constant of the network. Since the Lipschitz constant of a composition of layers is bounded by the product of the per-layer Lipschitz constants, $\text{Lip}(f) \leq \prod_l \text{Lip}(f_l)$, it suffices to constrain each layer independently. For a linear layer, the Lipschitz constant is a matrix norm of the weight matrix; however, enforcing a matrix-norm constraint exactly is (depending on the norm used) often intractable during gradient descent training. Therefore, Cisse et al. propose a per-layer regularizer of the form:

$R_\beta(W) = \frac{\beta}{2}\|W^T W - I\|_2^2$

where $I$ is the identity matrix and $\beta$ controls the strength of the regularization. During training, this regularizer pushes the learned weight matrices toward orthonormality – an efficient alternative to regular matrix manifold optimization techniques (see the paper for details).
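To make this concrete, below is a minimal PyTorch sketch of how such an orthonormality penalty could be added to a standard training loss. The function name, the toy model, and the value of `beta` are illustrative assumptions, not the authors' code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def parseval_regularizer(weight: torch.Tensor, beta: float = 1e-4) -> torch.Tensor:
    # R_beta(W) = beta/2 * ||W^T W - I||^2, pushing the columns of W
    # toward orthonormality (exactly attainable only if rows >= columns).
    cols = weight.shape[1]
    gram = weight.t() @ weight  # (cols, cols) Gram matrix
    identity = torch.eye(cols, device=weight.device, dtype=weight.dtype)
    return 0.5 * beta * torch.sum((gram - identity) ** 2)

# Toy usage: add the penalty for every linear layer to the task loss.
model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
loss = F.cross_entropy(model(x), y)
for module in model.modules():
    if isinstance(module, nn.Linear):
        loss = loss + parseval_regularizer(module.weight)
loss.backward()  # gradients now include the orthonormality penalty
```

Simply adding the penalty to the loss is the most direct reading of the regularizer; the paper describes a more efficient scheme for enforcing the constraint during training (see the paper).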

  • [1] Matthias Hein, Maksym Andriushchenko. Formal Guarantees on the Robustness of a Classifier against Adversarial Manipulation. CoRR abs/1705.08475 (2017).