
ARTICLE

On the Utility of Conformal Prediction Intervals

This article is an ad-hoc response to Ben Recht’s recent blog series on whether we need conformal prediction intervals. I have been thinking a lot about the use of conformal prediction myself, and this seems like a good opportunity to share some thoughts and lessons from working on conformal prediction over the past few years.
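
For context, the intervals in question can be constructed with split conformal prediction. The following is a minimal sketch for regression, using absolute residuals as conformity scores; the function and variable names are illustrative, not taken from the article.

```python
import numpy as np

def split_conformal_interval(residuals_cal, y_pred_test, alpha=0.1):
    """Split conformal prediction intervals from absolute residuals.

    residuals_cal: |y - f(x)| on a held-out calibration set.
    y_pred_test: point predictions f(x) on test inputs.
    Returns lower/upper bounds with roughly (1 - alpha) marginal coverage.
    """
    n = len(residuals_cal)
    # Conformal correction: the ceil((n + 1)(1 - alpha))/n empirical quantile.
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    q = np.quantile(residuals_cal, level, method="higher")
    return y_pred_test - q, y_pred_test + q

# Illustrative usage on synthetic residuals:
rng = np.random.default_rng(0)
residuals = np.abs(rng.normal(size=500))
lower, upper = split_conformal_interval(residuals, np.zeros(10), alpha=0.1)
```

The key property is that the threshold is a corrected empirical quantile of held-out residuals, which yields marginal coverage of roughly 1 - alpha without distributional assumptions.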


JANUARY 2024

PROJECT

Robustifying token attention for vision transformers.


JANUARY 2024

PROJECT

Improving patch robustness of vision transformers.


ARTICLE

Vanderbilt Machine Learning Seminar Talk “Conformal Prediction under Ambiguous Ground Truth”

Last week, I presented our work on Monte Carlo conformal prediction, that is, conformal prediction with ambiguous and uncertain ground truth, at the Vanderbilt Machine Learning Seminar Series. In this work, we show how to adapt standard conformal prediction when no unique ground truth labels are available due to disagreement among experts during annotation. In this article, I want to share the slides of my talk.
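
As a rough illustration of the idea, the sketch below calibrates a conformal threshold by sampling labels from per-example plausibility distributions instead of relying on a single ground truth label. It simplifies the actual method: the paper derives appropriate corrections for the Monte Carlo sampling, whereas this sketch naively pools the sampled scores. All names are hypothetical.

```python
import numpy as np

def mc_conformal_threshold(probs_cal, plausibilities, alpha=0.1, m=10, seed=0):
    """Calibrate a conformal threshold with ambiguous ground truth (sketch).

    probs_cal: (n, K) model probabilities on calibration examples.
    plausibilities: (n, K) per-example label distributions aggregated
        from expert opinions; each row sums to one.
    Samples m labels per example and pools the resulting conformity scores.
    """
    rng = np.random.default_rng(seed)
    n, num_classes = probs_cal.shape
    scores = []
    for i in range(n):
        sampled = rng.choice(num_classes, size=m, p=plausibilities[i])
        # Conformity score: one minus the probability of the sampled label.
        scores.extend(1.0 - probs_cal[i, sampled])
    scores = np.asarray(scores)
    level = min(np.ceil((len(scores) + 1) * (1 - alpha)) / len(scores), 1.0)
    return np.quantile(scores, level, method="higher")

def prediction_set(probs, tau):
    # All labels whose conformity score stays below the threshold.
    return np.where(1.0 - probs <= tau)[0]
```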


ARTICLE

PRECISE Seminar Talk “Evaluating and Calibrating AI Models with Uncertain Ground Truth”

I had the pleasure of presenting our work on evaluating and calibrating AI models with uncertain ground truth at the seminar series of the PRECISE center at the University of Pennsylvania. Besides talking about our recent papers on evaluating AI models in health with uncertain ground truth and on conformal prediction with uncertain ground truth, I also got to learn more about the research at PRECISE through post-doc and student presentations. In this article, I want to share the corresponding slides.


ARTICLE

arXiv Pre-Print “Evaluating AI Systems under Uncertain Ground Truth: a Case Study in Dermatology”

In supervised machine learning, we usually assume access to ground truth labels for evaluation. In many applications, however, these ground truth labels are derived from expert opinions, and disagreement among experts is typically resolved by simple majority voting or averaging. Unfortunately, this can have severe consequences, such as over-estimating performance or misguiding model selection. In the work presented in this article, we tackle this problem by introducing a statistical framework for aggregating expert opinions.
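
To make the failure mode concrete, the toy sketch below contrasts evaluating against majority-voted labels with evaluating in expectation over a distribution of plausible labels. How such plausibilities are aggregated from raw expert opinions is the actual subject of the paper and is not reproduced here; the helper names are hypothetical.

```python
import numpy as np

def majority_vote_accuracy(preds, expert_labels):
    """Accuracy against per-example majority votes over expert labels."""
    majority = [np.bincount(labels).argmax() for labels in expert_labels]
    return float(np.mean(np.asarray(preds) == np.asarray(majority)))

def expected_accuracy(preds, plausibilities):
    """Accuracy in expectation over plausible labels.

    plausibilities: (n, K) aggregated expert opinions; the expected 0/1
    accuracy of predicting class p is simply the plausibility of p.
    """
    return float(np.mean([plausibilities[i, p] for i, p in enumerate(preds)]))

# A prediction can look perfect under majority voting even though the
# experts consider it correct only 60% of the time:
plaus = np.array([[0.6, 0.4]])
print(majority_vote_accuracy([0], [np.array([0, 0, 1])]))  # 1.0
print(expected_accuracy([0], plaus))  # 0.6
```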


SEPTEMBER 2023

PROJECT

Achieving accurate, fair, and private image classification.


SEPTEMBER 2023

PROJECT

Keeping track of generated images using watermarking.


ARTICLE

Proper Robustness Evaluation of Confidence-Calibrated Adversarial Training in PyTorch

Properly evaluating defenses against adversarial examples has proven difficult because adversarial attacks need to be adapted to each individual defense. This also holds for confidence-calibrated adversarial training, where robustness is obtained by rejecting adversarial examples based on their confidence. As a result, standard robustness metrics and attacks are not directly applicable. In this article, I discuss how to evaluate confidence-calibrated adversarial training in terms of suitable metrics and attacks.
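
As a rough sketch of the evaluation protocol, one can pick a confidence threshold on correctly classified clean examples (for instance, at 99% true positive rate) and then count only confident errors on adversarial examples. The article and paper use a more careful per-example worst-case formulation; the simplified version below only illustrates the idea, and all names are hypothetical.

```python
import numpy as np

def confidence_threshold(clean_conf, clean_correct, tpr=0.99):
    """Pick tau so that ~tpr of correctly classified clean examples pass."""
    kept = np.sort(clean_conf[clean_correct])
    return kept[int((1.0 - tpr) * len(kept))]

def confidence_thresholded_robust_error(adv_conf, adv_correct, tau):
    """Count only confident errors; low-confidence examples count as rejected."""
    return float(np.mean(~adv_correct & (adv_conf >= tau)))

# Illustrative usage with random stand-ins for model outputs:
rng = np.random.default_rng(0)
clean_conf = rng.uniform(0.5, 1.0, size=1000)
clean_correct = rng.random(1000) < 0.95
tau = confidence_threshold(clean_conf, clean_correct)
adv_conf = rng.uniform(0.0, 1.0, size=1000)
adv_correct = rng.random(1000) < 0.3
print(confidence_thresholded_robust_error(adv_conf, adv_correct, tau))
```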
