Dongyu Meng, Hao Chen. MagNet: A Two-Pronged Defense against Adversarial Examples. ACM Conference on Computer and Communications Security, 2017.

Meng and Chen propose MagNet, a combination of adversarial example detection and removal. At test time, given a clean or adversarial test image, the proposed defense works as follows: First, the input is passed through one or multiple detectors. If one of these detectors fires, the input is rejected. To this end, the authors consider detection based on the reconstruction error of an auto-encoder or detection based on the divergence between probability predictions (on adversarial vs. clean example). Second, if not rejected, the input is passed through a reformed. The reformer reconstructs the input, e.g., through an auto-encoder, to remove potentially undetected adversarial noise.

Also find this summary on ShortScience.org.
What is your opinion on this article? Let me know your thoughts on Twitter @davidstutz92 or LinkedIn in/davidstutz92.