After introducing the mathematics of variational auto-encoders in a previous article, this article presents an implementation in Lua using Torch. The main challenges when implementing variational auto-encoders are the Kullback-Leibler divergence and the reparameterization sampler. Here, both are implemented as separate
A variational auto-encoder trained on corrupted (that is, noisy) examples is called a denoising variational auto-encoder. While it is easy to implement, the underlying mathematical framework changes significantly. As the second article in my series on variational auto-encoders, this article discusses the mathematical background of denoising variational auto-encoders.
In the third article of my series on variational auto-encoders, I want to discuss categorical variational auto-encoders. This variant allows learning a latent space of discrete (e.g. categorical or Bernoulli) latent variables. Compared to regular variational auto-encoders, the main challenge lies in deriving a working reparameterization trick for discrete latent variables, the so-called Gumbel trick.
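As a rough illustration of the idea behind the Gumbel trick, the following is a minimal sketch in plain Python, not the implementation from the article. It assumes the standard formulation: adding Gumbel(0, 1) noise to the log-probabilities and taking the argmax yields an exact categorical sample, while replacing the argmax with a temperature-scaled softmax gives a differentiable relaxation. The function names and the temperature value are hypothetical and chosen only for this example.

```python
import math
import random


def sample_gumbel(rng):
    # Gumbel(0, 1) noise via inverse transform sampling.
    # u is clamped away from 0 so that log(u) stays finite.
    u = max(rng.random(), 1e-12)
    return -math.log(-math.log(u))


def gumbel_max(logits, rng):
    # Exact categorical sample: index of the largest perturbed logit.
    perturbed = [l + sample_gumbel(rng) for l in logits]
    return max(range(len(logits)), key=lambda i: perturbed[i])


def gumbel_softmax(logits, temperature, rng):
    # Relaxed sample: softmax over perturbed logits divided by the temperature.
    # Lower temperatures push the result closer to a one-hot vector.
    perturbed = [(l + sample_gumbel(rng)) / temperature for l in logits]
    m = max(perturbed)  # subtract the maximum for numerical stability
    exps = [math.exp(p - m) for p in perturbed]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)
probs = [0.7, 0.2, 0.1]
logits = [math.log(p) for p in probs]

hard_sample = gumbel_max(logits, rng)            # an index in {0, 1, 2}
soft_sample = gumbel_softmax(logits, 0.5, rng)   # a point on the probability simplex
```

In a categorical variational auto-encoder, the relaxed sample would take the place of the Gaussian reparameterization, so that gradients can flow through the sampling step.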
As part of my master's thesis, I made heavy use of variational auto-encoders in order to learn latent spaces of shapes and later perform shape completion. Overall, I invested a big portion of my time in understanding and implementing different variants of variational auto-encoders. This article, the first in a small series, deals with the mathematics behind variational auto-encoders. It covers variational inference in general, the concrete case of variational auto-encoders, and practical considerations.
Recently proposed neural network architectures, including PointNets and PointSetGeneration networks, allow deep learning on unordered point clouds. In this article, I present a Torch implementation of a PointNet auto-encoder, a network that reconstructs point clouds through a lower-dimensional bottleneck. As the training loss, I implemented a symmetric Chamfer distance in C/CUDA and provide the code on GitHub.
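To make the loss concrete, here is a minimal sketch of a symmetric Chamfer distance in plain Python. It is not the C/CUDA code from the article. It assumes one common definition, namely the mean squared distance from each point to its nearest neighbor in the other set, summed over both directions; other variants use unsquared distances or sums instead of means. The function names are hypothetical.

```python
def squared_distance(p, q):
    # Squared Euclidean distance between two points of equal dimension.
    return sum((a - b) ** 2 for a, b in zip(p, q))


def one_sided_chamfer(source, target):
    # Mean over source points of the squared distance to the nearest target point.
    return sum(
        min(squared_distance(p, q) for q in target) for p in source
    ) / len(source)


def symmetric_chamfer(a, b):
    # Sum of both directions, so the order of the arguments does not matter.
    return one_sided_chamfer(a, b) + one_sided_chamfer(b, a)


prediction = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
target = [(0.0, 0.0, 0.0)]
distance = symmetric_chamfer(prediction, target)
```

This brute-force version compares every pair of points and is only meant to show the definition. Because each point is matched to its nearest neighbor independently, the distance does not depend on the ordering of the points, which is what makes it suitable for unordered point clouds.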
Adversarial examples are test images which have been perturbed slightly to cause misclassification. As these adversarial examples are usually unproblematic for us humans, but are able to easily fool deep neural networks, their discovery has sparked quite some interest in the deep learning and privacy/security communities. In this article, I want to provide a rough overview of the topic including a brief survey of relevant literature and some ideas on future research directions.
In 3D vision, a common problem involves the comparison of meshes. In 3D reconstruction or surface reconstruction, triangular meshes are usually compared in terms of accuracy and completeness, that is, the distance from the reconstruction to the reference and vice versa. In this article, I want to present an efficient C++ tool for computing accuracy and completeness with respect to both reference meshes and reference point clouds.
Triangular meshes are commonly used to represent shapes in computer graphics and computer vision. However, they are not well suited to many deep learning techniques. Therefore, meshes are commonly voxelized into occupancy grids or signed distance functions. This article presents a C++ tool for efficiently voxelizing (watertight) meshes.
Automatically obtaining high-quality watertight meshes in order to derive well-defined occupancy grids or signed distance functions is a common problem in 3D vision. In this article, I present a mesh fusion approach for obtaining watertight meshes. In combination with a standard mesh simplification algorithm, this approach produces high-quality, but lightweight, watertight meshes.