Carlini and Wagner show that defensive distillation does not work as a defense against adversarial examples. Specifically, they show that the attack by Papernot et al. can easily be modified to attack distilled networks. Interestingly, the main change is to introduce a temperature in the last softmax layer. This temperature, when chosen high enough, aligns the gradients from the softmax layer with those from the logit layer; otherwise, the two have significantly different magnitudes. Personally, I found that this also aligns with observations by Carlini and Wagner elsewhere, where they find that attack objectives defined directly on the logits work considerably better.
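The gradient issue can be sketched with a small NumPy example (the logit values below are made up for illustration). On the large, saturated logits that a distilled network tends to produce, the softmax Jacobian is nearly zero at temperature 1, so gradient-based attacks through the softmax get almost no signal; dividing the logits by a high temperature smooths the output and restores a usable gradient:

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax: divide the logits by T before normalizing."""
    z = np.asarray(z, dtype=float) / T
    z = z - z.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def softmax_jacobian(p):
    """Jacobian of softmax outputs w.r.t. its inputs: diag(p) - p p^T."""
    return np.diag(p) - np.outer(p, p)

# Hypothetical logits of a distilled network: training at high temperature
# tends to produce logits of large magnitude.
logits = np.array([40.0, 20.0, 10.0])

# At T = 1 the softmax output is essentially one-hot, so the Jacobian
# (and thus the gradient available to the attack) nearly vanishes.
grad_norm_low = np.linalg.norm(softmax_jacobian(softmax(logits, T=1.0)))

# At a high temperature the output is smoothed and the Jacobian is far
# from zero, so gradients through the softmax carry signal again.
grad_norm_high = np.linalg.norm(softmax_jacobian(softmax(logits, T=20.0)))

print(grad_norm_low, grad_norm_high)
```

This is only a toy illustration of the saturation effect, not a reimplementation of the attack itself; the paper's modification applies the temperature inside the attacked network's final layer.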
What is your opinion on the summarized work? Do you know of related work that might be of interest? Let me know your thoughts in the comments below!