Ben Graham. Sparse 3D convolutional neural networks. BMVC, 2015.

Graham generalizes sparse convolutional neural networks previously considered in [1] to 3D data. His approach is twofold:

  1. A square grid in 2D can be replaced by a triangular grid. Similarly, in 3D, a cubic grid can be replaced by a tetrahedral grid, as illustrated in Figure 1. This scheme can be applied to convolutions and pooling and offers to reduce the complexity.
  2. On sparse input data, convolutions only need to consider non-zero entries. Graham, therefore, uses hashmaps to efficiently store and identify non-zero entries in the cubic/tetrahedral grid to speed up convolutions.

Graham conducts several experiments meant as proof of concept how these two techniques can be used to speed up 3D convolutional networks, see the paper.

Figure 1 (click to enlarge): Illustration of different grid representations used by Graham.

The idea with a tetrahedral grid seems interesting, but concerning the speeded up convolutions, the approach by Engelcke et al. [2] seems more elegant.

  • [1] B. Graham. Spatially-sparse convolutional neural networks. 2014.
  • [2] M. Engelcke, D. Rao, D. Z. Wang, C. H. Tong, I. Posner. Vote3Deep: Fast Object Detection in 3D Point Clouds Using Efficient Convolutional Neural Networks. CoRR, 2016.
What is your opinion on this article? Let me know your thoughts on Twitter @davidstutz92 or LinkedIn in/davidstutz92.