ArXiv Pre-Print “Learning Optimal Conformal Classifiers”

Conformal prediction (CP) can turn any classifier into a set predictor with a guarantee that the true class is included with a user-specified probability. This makes it possible to develop classifiers with sufficient guarantees for safe deployment in many domains. However, CP is usually applied as a post-training calibration step. The paper presented in this article introduces a training procedure, named conformal training, that trains the classifier and the conformal predictor end-to-end. This can reduce the average confidence set size and makes it possible to optimize arbitrary objectives defined directly on the predicted sets.
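To make the post-training calibration step concrete, here is a minimal sketch of split conformal prediction with a simple probability threshold. It is an illustration under stated assumptions, not the paper's code: the function names `calibrate_threshold` and `predict_set` are hypothetical, the conformity score is taken to be the predicted probability of the true class, and the calibration examples are assumed to be held out and exchangeable with the test examples.

```python
import math


def calibrate_threshold(cal_probs, cal_labels, alpha):
    """Pick a threshold tau on held-out calibration data (hypothetical helper).

    cal_probs: list of per-class probability lists, one per calibration example.
    cal_labels: list of true class indices.
    alpha: tolerated miscoverage, i.e. the target coverage is 1 - alpha.
    """
    # Conformity score (assumption): predicted probability of the true class.
    scores = sorted(probs[label] for probs, label in zip(cal_probs, cal_labels))
    n = len(scores)
    # Finite-sample corrected rank: the floor(alpha * (n + 1))-th smallest score.
    k = math.floor(alpha * (n + 1))
    if k == 0:
        # Too few calibration examples for this alpha: include every class.
        return -math.inf
    return scores[k - 1]


def predict_set(probs, tau):
    """Confidence set: all classes whose probability reaches the threshold."""
    return [cls for cls, p in enumerate(probs) if p >= tau]


# Toy calibration data with three classes (made-up numbers).
cal_probs = [
    [0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4], [0.5, 0.4, 0.1],
    [0.2, 0.2, 0.6], [0.9, 0.05, 0.05], [0.25, 0.6, 0.15],
]
cal_labels = [0, 1, 2, 1, 2, 0, 1]

tau = calibrate_threshold(cal_probs, cal_labels, alpha=0.25)
print(tau, predict_set([0.5, 0.45, 0.05], tau))
```

Note that the classifier itself is never touched here: the threshold is fitted after training, which is exactly the separation that conformal training removes.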


Figure 1: Conformal prediction (CP) usually wraps any classifier $\pi_\theta(x)$ and constructs a confidence set $C_\theta$ with coverage guarantees. We develop differentiable prediction and calibration implementations for conformal prediction, which allow us to "simulate" CP on each mini-batch $B$ during training. This so-called conformal training (ConfTr) (a) calibrates on the first half of the batch and predicts confidence sets on the other half. This makes it possible to optimize arbitrary losses on the predicted confidence sets, e.g., to reduce the average confidence set size or to penalize the inclusion of specific classes (b).

Modern deep learning based classifiers show very high accuracy on test data, but this does not provide sufficient guarantees for safe deployment, especially in high-stakes AI applications such as medical diagnosis. Usually, predictions are obtained without a reliable uncertainty estimate or a formal guarantee. Conformal prediction (CP) addresses these issues by using the classifier's probability estimates to predict confidence sets containing the true class with a user-specified probability. However, using CP as a separate processing step after training prevents the underlying model from adapting to the prediction of confidence sets. Thus, this paper explores strategies to differentiate through CP during training, with the goal of training the model with the conformal wrapper end-to-end. In our approach, conformal training (ConfTr), we specifically "simulate" conformalization on mini-batches during training. We show that ConfTr outperforms state-of-the-art CP methods for classification by reducing the average confidence set size (inefficiency). Moreover, it makes it possible to "shape" the confidence sets predicted at test time, which is difficult for standard CP. In experiments on several datasets, we show that ConfTr can influence how inefficiency is distributed across classes, or guide the composition of confidence sets in terms of the included classes, while retaining the guarantees offered by CP.
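The idea of "simulating" conformalization on a mini-batch can be sketched as follows: calibrate a threshold on the first half of the batch, form soft confidence sets on the second half with a sigmoid, and penalize their size. This is only a forward-pass illustration under several assumptions. It uses a hard empirical quantile where an end-to-end differentiable version would need a smooth one, it computes no gradients (in practice this would be written in an autodiff framework), and the names `conftr_size_loss`, `temperature`, and `kappa` as well as their default values are hypothetical choices made for this example.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def batch_threshold(probs, labels, alpha):
    """Hard empirical quantile of true-class probabilities (not differentiable)."""
    scores = sorted(p[y] for p, y in zip(probs, labels))
    k = math.floor(alpha * (len(scores) + 1))
    return scores[k - 1] if k >= 1 else -math.inf


def conftr_size_loss(batch_probs, batch_labels, alpha=0.25,
                     temperature=0.1, kappa=1.0):
    """Average soft set size above kappa, computed on the second half of a batch."""
    half = len(batch_probs) // 2
    # Calibrate on the first half of the mini-batch.
    tau = batch_threshold(batch_probs[:half], batch_labels[:half], alpha)
    # Predict soft confidence sets on the second half and penalize their size.
    total = 0.0
    for probs in batch_probs[half:]:
        # Soft membership of each class: close to 1 if its probability exceeds tau.
        soft_size = sum(sigmoid((p - tau) / temperature) for p in probs)
        # Sizes up to kappa are not penalized (kappa = 1 tolerates singletons).
        total += max(0.0, soft_size - kappa)
    return total / (len(batch_probs) - half)


# Two toy mini-batches of eight examples each (made-up numbers).
confident_batch = [[0.9, 0.05, 0.05] for _ in range(8)]
uncertain_batch = [[0.4, 0.35, 0.25] for _ in range(8)]
labels = [0] * 8

loss_confident = conftr_size_loss(confident_batch, labels)
loss_uncertain = conftr_size_loss(uncertain_batch, labels)
print(loss_confident, loss_uncertain)
```

In this toy setting the uncertain batch yields larger soft sets and therefore a larger loss, which is the signal a size-based training objective would push the model to reduce.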

Paper on ArXiv
      author = {David Stutz and Krishnamurthy Dvijotham and Ali Taylan Cemgil and Arnaud Doucet},
      title = {Learning Optimal Conformal Classifiers}, 
      year = {2021},
      volume = {abs/2110.09192},
      journal = {CoRR},
