Haoqiang Fan, Hao Su, Leonidas J. Guibas. A Point Set Generation Network for 3D Object Reconstruction from a Single Image. CoRR abs/1612.00603, 2016.

Fan et al. introduce point set generation networks, which are closely related to and build on the PointNet idea []. Tackling the problem of single-image 3D reconstruction, they make two major contributions: they define and discuss reconstruction losses suitable for comparing two point clouds, and they extend the chosen loss to account for uncertainty. In general, they consider a model of the form

$S = G(I, r; \theta)$

where $S$ is the predicted point cloud, $I$ the input image (e.g. with depth), $r$ a random variable perturbing the input (e.g. $r \sim \mathcal{N}(0,1)$) and $\theta$ the parameters of the model. The vanilla (baseline) model they propose is illustrated in Figure 1.

Figure 1: Vanilla architecture consisting of a convolutional encoder and a predictor that is essentially a PointNet [].

Regarding the loss, they propose both the Chamfer distance and the Earth Mover Distance:

$D_{CD}(S_1, S_2) = \sum_{x \in S_1} \min_{y \in S_2} \|x - y\|_2^2 + \sum_{y \in S_2} \min_{x \in S_1} \|x - y\|_2^2$

$D_{EMD}(S_1, S_2) = \min_{\phi} \sum_{x \in S_1} \|x - \phi(x)\|_2$

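where, for the Earth Mover Distance, $\phi: S_1 \to S_2$ is a bijection between the two point sets, so that computing the distance amounts to solving an assignment problem (which requires both sets to have the same size). For efficiency, they use an approximation instead of solving this problem exactly.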
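To make the two distances concrete, here is a minimal sketch in plain Python that follows the formulas above term by term. Note that the Earth Mover Distance is computed here by brute force over all bijections, which is only feasible for a handful of points and is not the approximation used in the paper; the function names (`sq_dist`, `chamfer_distance`, `earth_mover_distance`) and the toy point sets are my own assumptions for illustration.

```python
import itertools
import math


def sq_dist(x, y):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def chamfer_distance(s1, s2):
    """Sum of squared nearest-neighbor distances in both directions."""
    forward = sum(min(sq_dist(x, y) for y in s2) for x in s1)
    backward = sum(min(sq_dist(x, y) for x in s1) for y in s2)
    return forward + backward


def earth_mover_distance(s1, s2):
    """Exact EMD by enumerating all bijections (illustration only, O(n!))."""
    if len(s1) != len(s2):
        raise ValueError("a bijection requires equally sized point sets")
    return min(
        sum(math.sqrt(sq_dist(x, y)) for x, y in zip(s1, perm))
        for perm in itertools.permutations(s2)
    )


# Two toy point sets with two 3D points each (hypothetical data).
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
b = [(1.0, 0.0, 0.0), (0.0, 0.0, 1.0)]
cd = chamfer_distance(a, b)
emd = earth_mover_distance(a, b)
print(cd, emd)
```

In this toy example, the Chamfer distance lets both points of `a` pick their nearest neighbor in `b` independently, while the Earth Mover Distance has to commit to a one-to-one matching.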

However, these losses alone do not take the uncertainty (modeled through the random variable $r$) into account. Therefore, they adapt the loss and state the overall optimization problem over the parameters $\theta$ of the model as

$\min_\theta \sum_k \min_{r_j \sim \mathcal{N}(0,1),\, 1 \leq j \leq n} \{d(G(I_k, r_j;\theta), S_k)\}$

where $S_k$ is the ground truth point cloud corresponding to image $I_k$ and $d$ is one of the two distances above. The loss is called the Min-of-N loss as it considers, for each image, only the best (minimum-distance) of $n$ randomized predictions.
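The Min-of-N objective can be sketched in a few lines of plain Python. This is only an illustration of the formula for a fixed $\theta$, not the authors' training code: `toy_generator` is a hypothetical stand-in for $G(I, r; \theta)$ that simply shifts a list of points by $\theta \cdot r$, the "images" are toy point lists, and the Chamfer distance is used as $d$.

```python
import random


def sq_dist(x, y):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def chamfer_distance(s1, s2):
    """Chamfer distance, used here as the distance d."""
    forward = sum(min(sq_dist(x, y) for y in s2) for x in s1)
    backward = sum(min(sq_dist(x, y) for x in s1) for y in s2)
    return forward + backward


def toy_generator(image, r, theta):
    """Hypothetical stand-in for G(I, r; theta): shifts every point by theta * r."""
    return [tuple(c + theta * r for c in point) for point in image]


def min_of_n_loss(generator, images, targets, theta, n, rng):
    """Sum over images of the minimum distance among n randomized predictions."""
    total = 0.0
    for image, target in zip(images, targets):
        # Draw n perturbations r_j ~ N(0, 1) for this image.
        samples = [rng.gauss(0.0, 1.0) for _ in range(n)]
        total += min(
            chamfer_distance(generator(image, r, theta), target) for r in samples
        )
    return total


# Toy data: one "image" and its ground truth point cloud (hypothetical).
images = [[(0.0, 0.0, 0.0)]]
targets = [[(1.0, 1.0, 1.0)]]

loss_no_noise = min_of_n_loss(toy_generator, images, targets, 0.0, 4, random.Random(0))
loss_n1 = min_of_n_loss(toy_generator, images, targets, 1.0, 1, random.Random(0))
loss_n8 = min_of_n_loss(toy_generator, images, targets, 1.0, 8, random.Random(0))
print(loss_no_noise, loss_n1, loss_n8)
```

With the same seed, the $n = 8$ run sees the $n = 1$ sample plus seven more, so its loss can only be equal or lower. In actual training, the outer minimization over $\theta$ would be done with gradient descent, which this sketch leaves out.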

They provide experimental results on various tasks, including shape completion from RGBD images, for which qualitative results can be found in Figure 2.

Figure 2: Qualitative results for the task of shape completion.
  • [] Charles Ruizhongtai Qi, Hao Su, Kaichun Mo, Leonidas J. Guibas. PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation. CoRR abs/1612.00593, 2016.
