Anish Athalye, Nicholas Carlini. On the Robustness of the CVPR 2018 White-Box Adversarial Example Defenses. CoRR abs/1804.03286 (2018).

Athalye and Carlini present experiments showing that pixel deflection [1] and the high-level representation guided denoiser [2] are ineffective as defenses against adversarial examples. In particular, they show that neither defense withstands projected gradient descent, currently the strongest first-order attack. They also comment on the appropriate threat model and explicitly state that the attacker should be assumed to know the employed defense, which makes intuitive sense when evaluating defenses.
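
To make the attack concrete, here is a minimal sketch of L-infinity projected gradient descent. It is not the authors' evaluation code. As an assumption for illustration, it attacks a toy logistic-regression model with hand-picked weights instead of a defended network, and the names `pgd_linf`, `eps`, `step`, and `iters` are hypothetical. In the white-box setting described above, the gradient would instead be taken through the full defended pipeline.

```python
import numpy as np

def pgd_linf(x, y, w, b, eps=0.3, step=0.05, iters=20):
    """L-infinity PGD against a toy logistic-regression model.

    x: input in [0, 1]^d, y: true label in {0, 1}, (w, b): model parameters.
    The toy model is an assumption standing in for a real network.
    """
    x_adv = x.copy()
    for _ in range(iters):
        # Predicted probability of class 1 for the current iterate.
        p = 1.0 / (1.0 + np.exp(-(w @ x_adv + b)))
        # Gradient of the cross-entropy loss with respect to the input.
        grad = (p - y) * w
        # Ascent step along the sign of the gradient (first-order attack).
        x_adv = x_adv + step * np.sign(grad)
        # Project back onto the eps-ball around the clean input.
        x_adv = np.clip(x_adv, x - eps, x + eps)
        # Keep the result in the valid input range.
        x_adv = np.clip(x_adv, 0.0, 1.0)
    return x_adv

# Hypothetical toy model and input, chosen only for illustration.
w = np.array([1.0, -2.0, 0.5])
b = 0.1
x = np.array([0.5, 0.2, 0.7])
y = 1  # the clean input is classified as class 1 (positive logit)

x_adv = pgd_linf(x, y, w, b, eps=0.3)
clean_logit = w @ x + b
adv_logit = w @ x_adv + b
print(clean_logit > 0, adv_logit > 0)
```

The projection step is what distinguishes PGD from plain gradient ascent: the perturbation never leaves the allowed `eps`-ball, so the adversarial example stays within the threat model.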

  • [1] Prakash, Aaditya, Moran, Nick, Garber, Solomon, DiLillo, Antonella, and Storer, James. Deflecting adversarial attacks with pixel deflection. In CVPR, 2018.
  • [2] Liao, Fangzhou, Liang, Ming, Dong, Yinpeng, Pang, Tianyu, Zhu, Jun, and Hu, Xiaolin. Defense against adversarial attacks using high-level representation guided denoiser. In CVPR, 2018.