Chuan Guo, Geoff Pleiss, Yu Sun, Kilian Q. Weinberger. On Calibration of Modern Neural Networks. ICML, 2017.

Guo et al. study calibration of deep neural networks as a post-processing step. Here, calibration means a correction of the predicted confidence scores, as these are commonly too overconfident in recent deep networks. They consider several state-of-the-art post-processing steps for calibration, but surprisingly, a simple linear mapping of the logits, or even plain scaling, works very well. So if $z_i$ are the logits of the network, then (the network being fixed) a parameter $T$ is found such that

$$\hat{q}_i = \max_k\, \sigma_{\text{SM}}\!\left(\frac{z_i}{T}\right)^{(k)}$$

is calibrated, where $\sigma_{\text{SM}}$ denotes the softmax and $T$ is chosen to minimize the NLL loss on a held-out validation set. Here, the temperature $T$ either softens ($T > 1$) or sharpens ($T < 1$) the probability distribution over classes. Interestingly, finding $T$ by optimizing the same training loss helps to reduce over-confidence.
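A minimal sketch of this temperature-scaling procedure in NumPy, assuming validation logits and labels are given. Note that the paper fits $T$ with gradient-based optimization; a simple grid search over $T$ stands in for it here, and the toy data below is purely illustrative.

```python
import numpy as np

def softmax(z, T=1.0):
    """Temperature-scaled softmax; T > 1 softens, T < 1 sharpens."""
    z = z / T
    z = z - z.max(axis=1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(logits, labels, T):
    """Average negative log-likelihood of the true labels under temperature T."""
    probs = softmax(logits, T)
    return -np.mean(np.log(probs[np.arange(len(labels)), labels] + 1e-12))

def fit_temperature(logits, labels, grid=np.linspace(0.05, 5.0, 100)):
    """Pick the T that minimizes validation NLL (grid search as a stand-in
    for the gradient-based optimization used in the paper)."""
    return min(grid, key=lambda T: nll(logits, labels, T))

# Toy example: artificially overconfident logits.
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=200)
logits = rng.normal(size=(200, 3))
logits[np.arange(200), labels] += 2.0  # boost the correct class
logits *= 5.0                          # inflate all logits -> overconfidence

T = fit_temperature(logits, labels)    # expect T > 1, i.e. softening
```

Since temperature scaling divides all logits by the same positive constant, the arg-max prediction, and hence accuracy, is unchanged; only the confidence scores are recalibrated.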

Also find this summary on ShortScience.org.

What is your opinion on the summarized work? Or do you know related work that is of interest? Let me know your thoughts in the comments below.