Chuan Guo, Geoff Pleiss, Yu Sun, Kilian Q. Weinberger. On Calibration of Modern Neural Networks. ICML, 2017.

Guo et al. study calibration of deep neural networks as a post-processing step. Here, calibration means a correction of the predicted confidence scores, as these are commonly too overconfident in recent deep networks. They consider several state-of-the-art post-processing steps for calibration, but surprisingly, a simple linear mapping of the logits, or even plain scaling, works very well. So if $z_i$ are the logits of the network, then (the network being fixed) a parameter $T$ is found such that

$$\hat{q}_i = \max_k\, \sigma_{\text{SM}}\!\left(\frac{z_i}{T}\right)^{(k)}$$

is calibrated, where $\sigma_{\text{SM}}$ denotes the softmax and $T$ is chosen to minimize the NLL loss on a held-out validation set. Here, the temperature $T$ either softens ($T > 1$) or sharpens ($T < 1$) the probability distribution over classes. Interestingly, finding $T$ by optimizing the same training loss helps to reduce over-confidence.
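A minimal sketch of this temperature-scaling procedure in NumPy, assuming validation logits and labels are given. Note that the paper fits $T$ with gradient-based optimization; a simple grid search over $T$ stands in for it here, and the toy data below is purely illustrative.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax; T > 1 softens, T < 1 sharpens."""
    z = z / T
    z = z - z.max(axis=1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(logits, labels, T):
    """Average negative log-likelihood of the true labels under temperature T."""
    probs = softmax(logits, T)
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

def fit_temperature(logits, labels, grid=np.linspace(0.05, 5.0, 100)):
    """Pick the T that minimizes validation NLL (grid search as a stand-in
    for the gradient-based optimization used in the paper)."""
    return min(grid, key=lambda T: nll(logits, labels, T))

# Toy example: artificially overconfident logits.
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=200)
logits = rng.normal(size=(200, 3))
logits[np.arange(200), labels] += 2.0  # boost the correct class
logits *= 5.0                          # inflate all logits -> overconfidence

T = fit_temperature(logits, labels)    # expect T > 1, i.e. softening
```

Since temperature scaling divides all logits by the same positive constant, the arg-max prediction, and hence accuracy, is unchanged; only the confidence scores are recalibrated.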

Also find this summary on ShortScience.org.

What is your opinion on the summarized work? Or do you know related work that is of interest? Let me know your thoughts in the comments below.