With the rising success of deep neural networks, their reliability in terms of robustness (for example, against various kinds of adversarial examples) and confidence estimates becomes increasingly important. Bayesian neural networks promise to address these issues by directly modeling the uncertainty of the estimated network weights. In this article, I want to give a short introduction of training Bayesian neural networks, covering three recent approaches.
Adversarial training is the de-facto standard to obtain models robust against adversarial examples. However, on complex datasets, a significant loss in accuracy is incurred and the robustness does not generalize to attacks not used during training. This paper introduces confidence-calibrated adversarial training. By forcing the confidence on adversarial examples to decay with their distance to the training data, the loss in accuracy is reduced and robustness generalizes to other attacks and larger perturbations.
Obtaining deep networks robust against adversarial examples is a widely open problem. While many papers are devoted to training more robust deep networks, a clear definition of adversarial examples has not been agreed upon. In this article, I want to discuss two very simple toy examples illustrating the necessity of a proper definition of adversarial examples.