Norman Mu, Justin Gilmer. MNIST-C: A Robustness Benchmark for Computer Vision. CoRR abs/1906.02337 (2019).

Mu and Gilmer introduce MNIST-C, an MNIST-based corruption benchmark for out-of-distribution evaluation. The benchmark covers several corruption types: random noise (shot and impulse noise), blur (glass and motion blur), affine transformations, “striping” or occluding parts of the image, Canny edge images, and simulated fog. These corruptions are shown in Figure 1. The transformations were chosen to be semantically invariant, meaning that the true class of the image does not change. This matters for evaluation because the original labels remain valid, so models can be tested directly on whether they still predict the correct labels on the corrupted images.

Figure 1: Examples of the corruption types included in MNIST-C.
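
To make the evaluation idea concrete, here is a minimal sketch of two corruptions and an accuracy check that reuses the clean labels. It is not the authors' implementation: `impulse_noise`, `stripe`, `accuracy`, and all parameter values are simplified stand-ins I chose for illustration, and images are assumed to be lists of rows with intensities in [0, 1] to keep the example dependency-free.

```python
import random


def impulse_noise(image, amount=0.1, rng=None):
    """Set roughly `amount` of the pixels to 0.0 or 1.0 (salt-and-pepper)."""
    rng = rng or random.Random(0)  # fixed seed by default (assumption, for reproducibility)
    out = [row[:] for row in image]  # copy so the clean image is left untouched
    for row in out:
        for j in range(len(row)):
            if rng.random() < amount:
                row[j] = 1.0 if rng.random() < 0.5 else 0.0
    return out


def stripe(image, start=12, width=4):
    """Occlude a horizontal band of `width` rows, beginning at row `start`."""
    out = [row[:] for row in image]
    for i in range(start, min(start + width, len(out))):
        out[i] = [0.0] * len(out[i])
    return out


def accuracy(predict, images, labels, corrupt=None):
    """Fraction of correct predictions; labels are reused unchanged for
    corrupted inputs, which is valid only if the corruption preserves the class."""
    correct = 0
    for image, label in zip(images, labels):
        x = corrupt(image) if corrupt is not None else image
        correct += int(predict(x) == label)
    return correct / len(labels)
```

Given some classifier `predict` (hypothetical here), comparing `accuracy(predict, images, labels)` against `accuracy(predict, images, labels, corrupt=stripe)` gives the drop in accuracy attributable to that corruption.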
