Deep neural network (DNN) accelerators are specialized hardware for inference and have received considerable attention in recent years. To reduce energy consumption, these accelerators are often operated at low voltage, which causes the accelerator memory to become unreliable. Additionally, recent work demonstrated attacks targeting individual bits in memory. In both cases, the induced bit errors can significantly reduce the accuracy of DNNs. In this paper, we tackle both random bit errors (due to low-voltage operation) and adversarial bit errors in DNNs. By explicitly taking such errors into account during training, we can improve robustness significantly.
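To make the notion of random bit errors concrete, here is a minimal sketch of how they could be simulated on quantized weights. It assumes weights are stored as unsigned 8-bit integers and that each bit flips independently with probability p; the function name inject_bit_errors and this error model are illustrative assumptions, not necessarily the scheme used in the paper.

```python
import random


def inject_bit_errors(weights, p, bits=8, rng=None):
    """Flip each bit of each quantized weight independently with probability p.

    Assumption: weights are non-negative integers that fit into `bits` bits.
    """
    rng = rng or random.Random()
    perturbed = []
    for w in weights:
        mask = 0
        for bit in range(bits):
            if rng.random() < p:
                mask |= 1 << bit  # mark this bit position as flipped
        perturbed.append(w ^ mask)  # XOR applies all marked flips at once
    return perturbed


if __name__ == "__main__":
    quantized = [0, 255, 17, 128]
    # With a fixed seed, the same bit error pattern is reproduced on every run.
    print(inject_bit_errors(quantized, p=0.01, rng=random.Random(0)))
```

Applying such a function to the weights in each training iteration is one way to take bit errors into account during training.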
Many papers and theses provide high-level overviews of the proposed methods. Nowadays, in computer vision, natural language processing or similar research areas strongly driven by deep learning, these illustrations commonly include the architecture of the (convolutional) neural network used. In this article, I want to provide a collection of examples using LaTeX and TikZ to produce nice figures of (convolutional) neural networks. All the discussed examples can also be found on GitHub.
While robustness against imperceptible adversarial examples is well-studied, robustness against visible adversarial perturbations such as adversarial patches is poorly understood. In this pre-print, we present a practical approach to obtain adversarial patches while actively optimizing their location within the image. On CIFAR10 and GTSRB, we show that adversarial training on these location-optimized adversarial patches improves robustness significantly without reducing accuracy.
PyTorch, alongside TensorFlow, has become standard among deep learning researchers and practitioners. While PyTorch provides a large variety of tensor operations and deep learning layers, some specialized operations still need to be implemented manually. In cases where runtime is crucial, this should be done in C or CUDA to support both CPU and GPU computation. In this article, I want to provide a simple example and framework for extending PyTorch with custom C and CUDA operations using CFFI for Python and CuPy.
Training on adversarial examples generated on-the-fly, so-called adversarial training, improves robustness against adversarial examples while incurring a significant drop in accuracy. This apparent trade-off between robustness and accuracy has been observed on many datasets and is argued to be inherent to adversarial training, or even unavoidable. In this article, based on my recent CVPR’19 paper, I show experimental results indicating that adversarial training can achieve the same accuracy as normal training if more training examples are available. This suggests that adversarial training has higher sample complexity.
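As a rough illustration of what "generated on-the-fly" means, the following sketch runs adversarial training on a tiny logistic regression model in plain Python. The one-step L∞ attack (a fast-gradient-sign step), the helper names and all hyperparameters are assumptions chosen for brevity; the experiments in the paper use deep networks and are not reproduced by this toy example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def logit(x, w, b):
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(x, y, w, b, eps):
    """One-step L-infinity attack on a logistic model; the label y is 0 or 1."""
    p = sigmoid(logit(x, w, b))
    # The gradient of the cross-entropy loss w.r.t. the input is (p - y) * w.
    return [xi + eps * sign((p - y) * wi) for xi, wi in zip(x, w)]


def adversarial_training(data, eps, lr=0.1, epochs=100):
    """SGD in which every example is replaced by an adversarial example
    computed against the current weights."""
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        for x, y in data:
            x_adv = fgsm(x, y, w, b, eps)  # generated on-the-fly
            g = sigmoid(logit(x_adv, w, b)) - y
            w = [wi - lr * g * xi for wi, xi in zip(w, x_adv)]
            b -= lr * g
    return w, b


if __name__ == "__main__":
    data = [([2.0, 2.0], 1), ([-2.0, -2.0], 0)]
    w, b = adversarial_training(data, eps=0.5)
    for x, y in data:
        x_adv = fgsm(x, y, w, b, eps=0.5)
        print(y, sigmoid(logit(x_adv, w, b)) > 0.5)
```

Normal training corresponds to setting eps to zero, in which case the attack returns the input unchanged.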
As outlined in previous articles, there seems to be a significant difference between regular, unconstrained adversarial examples and adversarial examples constrained to the data manifold. In this article, I want to demonstrate that adversarial training with on-manifold adversarial examples has the potential to improve generalization if the manifold is known or approximated well enough. As an alternative, for more complex datasets, knowledge of parts of the manifold is sufficient, leading to a kind of adversarial data augmentation using affine transformations.
Adversarial examples are commonly assumed to leave the manifold of the underlying data, although this has not been confirmed experimentally so far. Under this assumption, deep neural networks perform well on the manifold; however, slight perturbations in directions leaving the manifold may cause misclassification. In this article, based on my recent CVPR’19 paper, I want to empirically show that adversarial examples indeed leave the manifold. For this purpose, I will present results on a synthetic dataset with known manifold as well as on MNIST with approximated manifold.
The code for my latest paper on confidence-calibrated adversarial training has been released on GitHub. The repository includes not only a PyTorch implementation of confidence-calibrated adversarial training, but also several white-box and black-box attacks to generate adversarial examples and the proposed confidence-thresholded robust test error. Furthermore, these implementations are fully tested and make it possible to reproduce the results from the paper. This article gives an overview of the repository and highlights its features and components.
Recently, I had the opportunity to present my work on confidence-calibrated adversarial training at the Bosch Center for Artificial Intelligence and at the University of Tübingen, specifically at the newly formed Tübingen AI Center. As part of the talk, I outlined the motivation and strengths of confidence-calibrated adversarial training compared to standard adversarial training: robustness against previously unseen attacks and improved accuracy. I also touched on the difficulties faced during robustness evaluation. This article provides the corresponding slides and gives a short overview of the talk.
Adversarial training yields models that are robust against a specific threat model. However, this robustness does not generalize to larger perturbations or to threat models not seen during training. Confidence-calibrated adversarial training tackles this problem by biasing the network towards low-confidence predictions on adversarial examples. By rejecting low-confidence (adversarial) examples, robustness generalizes to various threat models, including L2, L1 and L0, while training only on L∞ adversarial examples. This article gives a short abstract, discusses relevant updates to the previous version and includes the paper and appendix.
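The rejection step described above can be sketched as a simple confidence threshold on the softmax output. The function names and the threshold value below are hypothetical and only illustrate the idea; how the threshold is actually chosen is described in the paper, not here.

```python
import math


def softmax(logits):
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_or_reject(logits, threshold):
    """Return the predicted class index, or None if the confidence
    (the highest softmax probability) falls below the threshold."""
    probs = softmax(logits)
    confidence = max(probs)
    if confidence < threshold:
        return None  # rejected as a low-confidence, possibly adversarial, input
    return probs.index(confidence)


if __name__ == "__main__":
    print(predict_or_reject([5.0, 0.0, 0.0], threshold=0.9))  # confident prediction
    print(predict_or_reject([0.1, 0.0, 0.0], threshold=0.9))  # near-uniform, rejected
```

If the network has learned to produce near-uniform predictions on adversarial examples, such inputs fall below the threshold and are rejected instead of being misclassified.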