Anish Athalye, Nicholas Carlini. On the Robustness of the CVPR 2018 White-Box Adversarial Example Defenses. CoRR abs/1804.03286 (2018).

Athalye and Carlini present experiments showing that pixel deflection [1] and the high-level representation guided denoiser [2] are ineffective as defenses against adversarial examples. In particular, they show that neither defense withstands projected gradient descent, currently the strongest first-order attack. They also comment on the appropriate threat model and explicitly state that the attacker knows the employed defense, which makes intuitive sense when evaluating defenses in a white-box setting.

  • [1] Prakash, Aaditya, Moran, Nick, Garber, Solomon, DiLillo, Antonella, and Storer, James. Deflecting adversarial attacks with pixel deflection. In CVPR, 2018.
  • [2] Liao, Fangzhou, Liang, Ming, Dong, Yinpeng, Pang, Tianyu, Zhu, Jun, and Hu, Xiaolin. Defense against adversarial attacks using high-level representation guided denoiser. In CVPR, 2018.
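
For readers unfamiliar with projected gradient descent, the following is a minimal sketch of an L-infinity variant. It is not the authors' experimental setup: the toy logistic-regression model, its weights `w` and `b`, the input `x`, and the values of `eps`, `step` and `iters` are all assumptions chosen for illustration. In the white-box threat model discussed above, the gradient would be taken through the complete defended pipeline; this toy model has no defense component.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def loss_grad(x, y, w, b):
    """Gradient of the binary cross-entropy loss w.r.t. the input x
    for a logistic model p = sigmoid(w.x + b)."""
    p = sigmoid(x @ w + b)
    return (p - y) * w


def pgd_linf(x, y, w, b, eps=0.2, step=0.02, iters=20):
    """L-infinity PGD: repeatedly step along the sign of the loss
    gradient, then project back into the eps-ball around x."""
    x_adv = x.copy()
    for _ in range(iters):
        g = loss_grad(x_adv, y, w, b)
        x_adv = x_adv + step * np.sign(g)          # ascend the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)   # project onto the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)           # stay in a valid pixel range
    return x_adv


# Hypothetical toy model and input (illustrative values only).
w = np.array([2.0, -3.0, 1.0, 0.5])
b = -0.2
x = np.array([0.6, 0.3, 0.5, 0.4])
y = 1.0

x_adv = pgd_linf(x, y, w, b)
p_clean = sigmoid(x @ w + b)
p_adv = sigmoid(x_adv @ w + b)
print("clean prediction:", p_clean > 0.5, "adversarial prediction:", p_adv > 0.5)
```

The projection step is what distinguishes PGD from plain iterated gradient ascent: the perturbation never exceeds `eps` in any coordinate, so the adversarial input stays within the allowed distance of the original.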
