Sharma and Chen provide an experimental comparison of several state-of-the-art attacks against the adversarial training defense of Madry et al. The attacks considered include the Carlini-Wagner attack, the elastic-net attack, and projected gradient descent (PGD). Their central experimental finding, that the defense of Madry et al. can be broken by increasing the allowed perturbation size (i.e., epsilon), should not be surprising: an adversarially trained network can only be expected to defend reliably against perturbations of the kind and magnitude seen during training, as the sketch below illustrates.
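
To make the point concrete, here is a minimal sketch of an L-infinity PGD attack in PyTorch, not the authors' exact implementation. It assumes a hypothetical classifier `model` mapping images in [0, 1] to logits; evaluating it with an `eps` larger than the one used for adversarial training is exactly the setting in which the defense degrades.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps, alpha, steps):
    """L-infinity PGD: iterated signed-gradient ascent on the loss,
    projected back onto the eps-ball around the clean input x."""
    x = x.detach()
    # Random start inside the eps-ball, clipped to the valid pixel range.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        # Ascend the loss, then project onto the eps-ball and [0, 1].
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = (x + (x_adv - x).clamp(-eps, eps)).clamp(0, 1)
    return x_adv.detach()

# Hypothetical evaluation: a model trained against eps = 8/255 will
# typically resist the first attack but not the second, larger one.
# x_adv_train_eps = pgd_attack(model, x, y, eps=8/255,  alpha=2/255, steps=40)
# x_adv_large_eps = pgd_attack(model, x, y, eps=16/255, alpha=4/255, steps=40)
```

The epsilon values above are illustrative; the point is only that robust accuracy is defined relative to a threat model, and nothing in adversarial training guarantees robustness outside the ball it was trained on.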