Adversarial transformations, such as small crops, rotations, and translations, are another alternative to the regular $L_p$-constrained adversarial examples and are even less visible than adversarial patches or frames. Similar to $L_p$ adversarial examples, they are often hard to spot unless the original image is available for direct comparison. In this article, I will include a PyTorch implementation and some results against adversarial training.
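To give a rough idea of how such an attack can be implemented, the sketch below performs a simple random search over small rotations and translations and keeps the transformation that maximizes the cross-entropy loss. The function names, the search budget, and the ranges of roughly ±10 degrees and ±10% translation are placeholders, not necessarily the settings used in the article.

```python
import torch
import torch.nn.functional as F

def transform(images, angle, tx, ty):
    # Rotate by angle (radians) and translate by (tx, ty) in normalized [-1, 1] coordinates.
    cos, sin = torch.cos(angle), torch.sin(angle)
    theta = torch.stack([
        torch.stack([cos, -sin, tx]),
        torch.stack([sin, cos, ty]),
    ]).unsqueeze(0).expand(images.size(0), -1, -1)
    grid = F.affine_grid(theta, images.size(), align_corners=False)
    return F.grid_sample(images, grid, align_corners=False)

def adversarial_transformation(model, images, labels, attempts=50):
    # Random search: keep the rotation/translation that maximizes the cross-entropy loss.
    best_loss, best_images = float("-inf"), images
    with torch.no_grad():
        for _ in range(attempts):
            angle = (torch.rand(1)[0] - 0.5) * 2 * (10 * 3.141593 / 180)  # up to +/- 10 degrees
            tx, ty = (torch.rand(2) - 0.5) * 2 * 0.1                      # up to +/- 10% translation
            transformed = transform(images, angle, tx, ty)
            loss = F.cross_entropy(model(transformed), labels).item()
            if loss > best_loss:
                best_loss, best_images = loss, transformed
    return best_images
```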
Adversarial patches and frames are an alternative to the regular $L_p$-constrained adversarial examples. Adversarial patches are often considered more realistic, mirroring graffiti or stickers in the real world. In this article, I want to discuss a simple PyTorch implementation and present some results of adversarial patches against adversarial training as well as confidence-calibrated adversarial training.
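As a teaser of what such an implementation boils down to, the following sketch optimizes a single patch with signed gradient ascent on the cross-entropy loss; for simplicity, the patch is pasted into a fixed corner, whereas a proper implementation would also randomize the patch location, and all names and hyperparameters here are merely illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def patch_attack(model, images, labels, patch_size=8, iterations=100, alpha=0.05):
    # Optimize a patch that maximizes the cross-entropy loss when pasted onto the images.
    patch = torch.rand(1, images.size(1), patch_size, patch_size, requires_grad=True)
    for _ in range(iterations):
        patched = images.clone()
        patched[:, :, :patch_size, :patch_size] = patch  # paste the patch (fixed corner for simplicity)
        loss = F.cross_entropy(model(patched), labels)
        loss.backward()
        with torch.no_grad():
            patch += alpha * patch.grad.sign()  # ascend the loss
            patch.clamp_(0, 1)                  # keep valid pixel values
        patch.grad.zero_()
    return patch.detach()
```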
Out-of-distribution examples are images that are clearly irrelevant to the task at hand. Unfortunately, deep neural networks frequently assign random labels with high confidence to such examples. In this article, I want to discuss an adversarial way of computing high-confidence out-of-distribution examples, so-called distal adversarial examples, and how confidence-calibrated adversarial training handles them.
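As a rough sketch of how such distal adversarial examples might be computed, the code below starts from uniform noise and runs signed gradient ascent on the model's confidence, i.e., the maximum softmax probability, within an $L_\infty$ ball around the noise; the function name, radius, and step size are assumptions and not necessarily the exact setup discussed in the article.

```python
import torch

def distal_adversarial_examples(model, shape, epsilon=0.3, alpha=0.01, iterations=200):
    # Start from uniform noise, i.e., "images" far away from the data distribution.
    noise = torch.rand(shape)
    delta = torch.zeros_like(noise, requires_grad=True)
    for _ in range(iterations):
        probabilities = torch.softmax(model(noise + delta), dim=1)
        confidence = probabilities.max(dim=1)[0]  # confidence = maximum predicted probability
        loss = -confidence.sum()                  # minimizing this maximizes confidence
        loss.backward()
        with torch.no_grad():
            delta -= alpha * delta.grad.sign()                # signed gradient step
            delta.clamp_(-epsilon, epsilon)                   # project onto the L_inf ball
            delta.copy_((noise + delta).clamp(0, 1) - noise)  # keep images in [0, 1]
        delta.grad.zero_()
    return (noise + delta).detach()
```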
Properly evaluating defenses against adversarial examples has been difficult as adversarial attacks need to be adapted to each individual defense. This also holds for confidence-calibrated adversarial training, where robustness is obtained by rejecting adversarial examples based on their confidence. Thus, regular robustness metrics and attacks are not easily applicable. In this article, I want to discuss how to evaluate confidence-calibrated adversarial training in terms of metrics and attacks.
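To make this concrete, here is a simplified sketch of a confidence-thresholded evaluation: the threshold is picked so that roughly 99% of correctly classified clean examples are kept, and an example only counts towards the robust test error if it is misclassified and not rejected. The exact metric and threshold selection in the article may differ; the function names are mine.

```python
import torch

def confidence(logits):
    # Confidence = maximum softmax probability.
    return torch.softmax(logits, dim=1).max(dim=1)[0]

def threshold_at_tpr(clean_logits, labels, tpr=0.99):
    # Choose the confidence threshold so that a fraction `tpr` of correctly
    # classified clean examples is kept (i.e., not rejected).
    correct = clean_logits.argmax(dim=1) == labels
    return torch.quantile(confidence(clean_logits)[correct], 1 - tpr)

def thresholded_robust_test_error(clean_logits, adv_logits, labels, tau):
    # An example counts as an error if its clean or adversarial version is
    # misclassified while its confidence exceeds the threshold tau.
    clean_errors = (clean_logits.argmax(dim=1) != labels) & (confidence(clean_logits) > tau)
    adv_errors = (adv_logits.argmax(dim=1) != labels) & (confidence(adv_logits) > tau)
    return (clean_errors | adv_errors).float().mean().item()
```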
Taking adversarial training from this previous article as a baseline, this article introduces a new, confidence-calibrated variant of adversarial training that addresses two significant flaws: First, trained with $L_\infty$ adversarial examples, adversarial training is not robust against $L_2$ ones. Second, it incurs a significant increase in (clean) test error. Confidence-calibrated adversarial training tackles both problems by encouraging low confidence on adversarial examples and subsequently rejecting them.
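The following sketch illustrates the core idea of the loss on adversarial examples: instead of the hard correct label, the target is a convex combination of the one-hot label and the uniform distribution that moves towards uniform as the perturbation grows. The transition function and the exponent rho below are simplified assumptions, and the attack computing delta during training is omitted; the article discusses the exact formulation.

```python
import torch
import torch.nn.functional as F

def ccat_target(labels, delta, epsilon, num_classes, rho=10):
    # Soft target: interpolate between one-hot label and uniform distribution,
    # depending on the size of the perturbation delta relative to epsilon.
    norms = delta.flatten(1).abs().max(dim=1)[0]            # per-example L_inf norm
    lam = (1 - torch.clamp(norms / epsilon, max=1)) ** rho  # 1 -> one-hot, 0 -> uniform
    one_hot = F.one_hot(labels, num_classes).float()
    uniform = torch.full_like(one_hot, 1.0 / num_classes)
    return lam.unsqueeze(1) * one_hot + (1 - lam.unsqueeze(1)) * uniform

def ccat_loss(model, images, delta, labels, epsilon, num_classes):
    # Cross-entropy between the prediction on the adversarial example and the
    # soft target, encouraging low (uniform) confidence on large perturbations.
    log_probs = F.log_softmax(model(images + delta), dim=1)
    targets = ccat_target(labels, delta, epsilon, num_classes)
    return -(targets * log_probs).sum(dim=1).mean()
```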
I was planning to have an article series on bit error robustness in deep learning, similar to my article series on adversarial robustness, with accompanying PyTorch code. However, the recent progress in machine learning made me focus on other projects. Nevertheless, the articles should […]
Series of articles discussing adversarial robustness and adversarial training in PyTorch.
Report of the 2020 Max Planck PhDNet survey results.
Knowing, from this previous article, how to compute adversarial examples, we would ideally like to train models for which such adversarial examples do not exist. This is the goal of developing adversarially robust training procedures. In this article, I want to describe a particularly popular approach called adversarial training. The idea is to train on adversarial examples computed on-the-fly during training. I will also discuss a PyTorch implementation that obtains 47.9% robust test error (52.1% robust accuracy) on CIFAR10 using a WRN-28-10 architecture.
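As a rough sketch of what this looks like in PyTorch (not the exact implementation from the article), each batch is first attacked with a few iterations of projected gradient descent and the model is then updated on the resulting adversarial examples; the hyperparameters below are common CIFAR10 defaults and merely placeholders.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, images, labels, epsilon=8/255, alpha=2/255, iterations=7):
    # Projected gradient descent within an L_inf ball of radius epsilon.
    delta = torch.zeros_like(images).uniform_(-epsilon, epsilon).requires_grad_(True)
    for _ in range(iterations):
        loss = F.cross_entropy(model(images + delta), labels)
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()                   # ascend the loss
            delta.clamp_(-epsilon, epsilon)                      # project onto the epsilon-ball
            delta.copy_((images + delta).clamp(0, 1) - images)   # keep images in [0, 1]
        delta.grad.zero_()
    return delta.detach()

def adversarial_training_step(model, optimizer, images, labels):
    # One step of adversarial training: compute adversarial examples on-the-fly
    # and train on them instead of the clean images.
    model.eval()  # attack with fixed batch-norm statistics
    delta = pgd_attack(model, images, labels)
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(images + delta), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```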
With our paper on conformal training, we showed how conformal prediction can be integrated into end-to-end training pipelines. There are so many interesting directions for improving and building upon conformal training. Unfortunately, I just do not have the bandwidth to pursue all of them. So, in this article, I want to share some research ideas so that others can pick them up.