Last week, I presented our work on Monte Carlo conformal prediction (conformal prediction with ambiguous and uncertain ground truth) at the Vanderbilt Machine Learning Seminar Series. In this work, we show how to adapt standard conformal prediction when no unique ground truth labels are available because experts disagree during annotation. In this article, I want to share the slides of my talk.
I had the pleasure of presenting our work on evaluating and calibrating with uncertain ground truth at the seminar series of the PRECISE center at the University of Pennsylvania. Besides talking about our recent papers on evaluating AI models in health with uncertain ground truth and on conformal prediction with uncertain ground truth, I also got to learn more about the research at PRECISE through post-doc and student presentations. In this article, I want to share the corresponding slides.
Conformal prediction uses a held-out, labeled set of examples to calibrate a classifier so that it yields confidence sets that include the true label with a user-specified probability. But what happens if even experts disagree on the ground truth labels? Commonly, this is resolved by taking the majority-voted label from multiple experts. However, in difficult and ambiguous tasks, the majority-voted label might be misleading and a poor representation of the underlying true posterior distribution. In this paper, we introduce Monte Carlo conformal prediction, which allows us to perform conformal calibration directly against expert opinions or aggregate statistics thereof.
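To make the calibration step described above concrete, here is a minimal sketch of standard split conformal prediction for a classifier. It covers only the usual setting with one unique label per calibration example, not the Monte Carlo variant from the paper. The function names `calibrate_threshold` and `predict_sets`, the choice of conformity score (the predicted probability of the true label), and the toy numbers are all assumptions made for illustration.

```python
import numpy as np


def calibrate_threshold(cal_probs, cal_labels, alpha):
    """Return a conformal threshold tau from a held-out calibration set.

    Hypothetical helper: cal_probs has shape (n, num_classes) and
    cal_labels has shape (n,) with integer class indices.
    """
    n = cal_labels.shape[0]
    # Conformity score: predicted probability of the (assumed unique) true label.
    scores = cal_probs[np.arange(n), cal_labels]
    # Finite-sample correction: use the k-th smallest score,
    # with k = floor(alpha * (n + 1)).
    k = int(np.floor(alpha * (n + 1)))
    if k < 1:
        # Too few calibration examples for this alpha: include every class.
        return 0.0
    return float(np.sort(scores)[k - 1])


def predict_sets(test_probs, tau):
    """Boolean mask of shape (num_examples, num_classes); True means 'in the set'."""
    return test_probs >= tau


# Toy calibration data: 3 examples, 2 classes (illustrative values only).
cal_probs = np.array([[0.7, 0.3], [0.8, 0.2], [0.5, 0.5]])
cal_labels = np.array([0, 1, 0])
tau = calibrate_threshold(cal_probs, cal_labels, alpha=0.5)

test_probs = np.array([[0.6, 0.4], [0.5, 0.5], [0.9, 0.1]])
sets = predict_sets(test_probs, tau)
print(tau, sets)
```

The sketch relies on `cal_labels` holding exactly one label per example. That is precisely the assumption that breaks down when experts disagree, which is the situation the paper addresses.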
In supervised machine learning, we usually assume access to ground truth labels for evaluation. In many applications, however, these labels are derived from expert opinions. Disagreement among the experts is typically ignored by applying simple majority voting or averaging. Unfortunately, this can have severe consequences, such as overestimating performance or misguiding model selection. In the work presented in this article, we tackle this problem by introducing a statistical framework for aggregating expert opinions.
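As a small illustration of why majority voting can overestimate performance, the sketch below compares accuracy against majority-voted labels with the average agreement against the full empirical distribution of expert votes. This is not the statistical framework from the paper, only a toy comparison. The helper `expert_label_distribution`, the vote matrix, and the predictions are made up for the example.

```python
import numpy as np


def expert_label_distribution(votes, num_classes):
    """Empirical class distribution per example from a (num_examples, num_experts) vote matrix."""
    counts = np.stack([np.bincount(v, minlength=num_classes) for v in votes])
    return counts / counts.sum(axis=1, keepdims=True)


# Hypothetical annotations: 3 examples, 5 experts each, 3 classes.
votes = np.array([
    [0, 0, 0, 0, 0],  # full agreement
    [1, 1, 2, 2, 1],  # narrow majority for class 1
    [2, 0, 2, 1, 0],  # tie between classes 0 and 2
])
dist = expert_label_distribution(votes, num_classes=3)

# Majority voting collapses each distribution to one label;
# ties are broken arbitrarily by argmax (lowest class index).
majority = dist.argmax(axis=1)

# Hypothetical model predictions that happen to match the majority labels.
preds = np.array([0, 1, 0])

# Accuracy against the majority-voted labels.
majority_accuracy = float((preds == majority).mean())
# Average fraction of experts who agree with each prediction.
expected_agreement = float(dist[np.arange(len(preds)), preds].mean())

print(majority_accuracy, expected_agreement)
```

Here the model looks perfect against the majority-voted labels, while on average only about two thirds of the expert votes support its predictions. The gap comes entirely from the second and third examples, where the experts disagree.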
In September, I received the DAGM MVTec dissertation award 2023 for my PhD thesis. DAGM is the German association for pattern recognition and organizes the German Conference on Pattern Recognition (GCPR) which is Germany’s prime conference for computer vision and related research areas. I feel particularly honored by this award since my academic career started with my first paper published as part of the young researcher forum at GCPR 2015 in Aachen.