Abhishek Sharma, Oliver Grau, Mario Fritz. VConv-DAE: Deep Volumetric Shape Learning Without Object Labels. ECCV Workshops, 2016.

Sharma et al. use a volumetric convolutional denoising auto-encoder for shape completion and classification on the ModelNet [] dataset. Their approach is comparatively simple: extending the regular denoising auto-encoder to volumetric data is straightforward at the resolution of $30^3$ used. The architecture is illustrated in Figure 1. A dropout layer after the input simulates noise, and the bottleneck layer has dimensionality $6912$.
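As a quick sanity check on these numbers, the following sketch computes the bottleneck size of a small volumetric encoder. The layer configuration (filter counts, kernel sizes, strides) is an assumption chosen to be consistent with the $30^3$ input and the $6912$-dimensional bottleneck; it is not taken from this summary, and `conv_output_size` and `bottleneck_dimension` are hypothetical helper names.

```python
def conv_output_size(size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Output size of a convolution along a single spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


# ASSUMPTION: (filters, kernel, stride) per encoder layer, picked so that
# a 30^3 input ends up with 6912 bottleneck units.
ASSUMED_ENCODER = [(64, 9, 3), (256, 4, 2)]


def bottleneck_dimension(resolution: int, layers) -> int:
    """Number of units after applying all layers to a cubic voxel grid."""
    size, channels = resolution, 1  # single-channel occupancy grid
    for filters, kernel, stride in layers:
        size = conv_output_size(size, kernel, stride)
        channels = filters
    return channels * size ** 3


print(bottleneck_dimension(30, ASSUMED_ENCODER))  # 256 * 3^3 = 6912
```

Under this assumed configuration, the spatial resolution shrinks from $30$ to $8$ to $3$ per axis, so the bottleneck holds $256 \cdot 3^3 = 6912$ values.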


Figure 1: The volumetric, convolutional denoising auto-encoder architecture used by Sharma et al.


In experiments on the ModelNet [] dataset, they demonstrate the applicability of their model to classification and shape completion. For classification, they outperform the ShapeNet model [] both when training an SVM on the $6912$-dimensional representation and when fine-tuning with two additional fully connected layers. However, VoxNet [] still outperforms their approach. Qualitative results for shape completion on random noise and slicing noise are shown in Figures 2 and 3, respectively. The model seems to struggle most with the low resolution, and this hurts particularly on slicing noise.
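To make the two corruption types concrete, here is a minimal, dependency-free sketch of how random noise and slicing noise could be applied to an occupancy grid. It stores a shape as a set of occupied voxel coordinates. The function names, the toy cube, and the exact noise definitions (dropping occupied voxels independently, removing an axis-aligned slab) are assumptions for illustration and may differ from the paper's protocol.

```python
import random


def random_noise(voxels, drop_prob, rng):
    """Drop each occupied voxel independently with probability drop_prob."""
    return {v for v in voxels if rng.random() >= drop_prob}


def slicing_noise(voxels, axis, start, width):
    """Remove every occupied voxel with start <= coordinate < start + width on axis."""
    return {v for v in voxels if not (start <= v[axis] < start + width)}


# Toy shape (assumption): a solid 10 x 10 x 10 cube inside a 30^3 grid.
cube = {(x, y, z)
        for x in range(10, 20)
        for y in range(10, 20)
        for z in range(10, 20)}

rng = random.Random(0)  # seeded generator for repeatable corruption
noisy = random_noise(cube, drop_prob=0.5, rng=rng)
sliced = slicing_noise(cube, axis=0, start=12, width=3)

print(len(cube), len(sliced))  # 1000 700
```

A denoising auto-encoder would then be trained to map `noisy` or `sliced` inputs back to the clean `cube`; slicing removes whole contiguous regions, which is harder to recover than scattered missing voxels.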


Figure 2: Qualitative results for shape completion from random noise. Comparison to ShapeNet [].

Figure 3: Qualitative results of shape completion on slicing noise and comparison to ShapeNet [].

  • [] Wu, Z., Song, S., Khosla, A., Yu, F., Zhang, L., Tang, X., Xiao, J.: 3D ShapeNets: A deep representation for volumetric shapes. In: CVPR. (2015).
  • [] Su, H., Maji, S., Kalogerakis, E., Learned-Miller, E.G.: Multi-view convolutional neural networks for 3D shape recognition. In: ICCV. (2015).