

Christian Wilms, Simone Frintrop. Edge Adaptive Seeding for Superpixel Segmentation. GCPR, 2017.

Finding appropriate superpixel seeds that allow the superpixel density to adapt automatically to the image content is a known problem. Existing approaches adapt the superpixel density based on saliency [10] or depth [28]. The authors of this paper propose to use edge information for adaptive seeding. The approach can be summarized as follows. The structured forest edge detector [6,7] is used to detect edges. These are thresholded and then smoothed using a Gaussian. The result is clustered using k-means into a fixed number of clusters $K$. For each cluster $k$, a superpixel algorithm is run with a desired number of superpixels depending on the average edge density in the cluster. In particular, the total number of superpixels $n$ follows

$n = \sum_{k = 1}^K a_k \left(n_1 + (n_K - n_1)\frac{w_k}{w_K}\right)$ (1)

where $n_1$ is the number of superpixels in the coarsest resolution and $n_K$ is the number of superpixels in the finest resolution. Furthermore, $a_k$ is the relative size of the $k$-th cluster; this is later used to pre-compute $n_k$, the number of superpixels for the $k$-th resolution, given an overall number of superpixels $n$. The weights $w_k$ are defined by

$w_k = \begin{cases}1 - b^{\frac{e_k - e_1}{e_K - e_1}} &\quad\text{ if } b < 1\\ b^{\frac{e_k - e_1}{e_K - e_1} - 1}&\quad\text{ else}\end{cases}$ (2)

with $b$ being a hyperparameter and $e_k$ denoting the average edge density in cluster $k$. Solving Equation (1) for $n_K$ gives:

$n_K = \frac{n_1 \sum_{k = 1}^K a_k\left(1 - \frac{w_k}{w_K}\right) - n}{-\sum_{k = 1}^K a_k\frac{w_k}{w_K}}$

The remaining $n_k$ can then be calculated as

$n_k = n_1 + (n_K - n_1)\frac{w_k}{w_K}$.

Intuitively, $n_1$ is fixed in advance, i.e. the coarsest resolution is pre-defined. Then, $n_K$ is defined as above and all the intermediate $n_k$ can be computed. Here, the weight $w_k$ increases if the average edge density in cluster $k$ is higher. As a result, $n_k$ becomes greater relative to the provided $n_1$. The described procedure is also summarized in Figure 1.
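The per-cluster superpixel numbers can be sketched in a few lines. The helper below is my own illustration of Equations (1) and (2), not code from the paper; the function and argument names are made up.

```python
import numpy as np

def adaptive_superpixel_numbers(e, a, n, n1, b):
    """Per-cluster superpixel numbers n_k from Equations (1) and (2).

    e:  average edge densities per cluster, sorted ascending (e[0] = e_1)
    a:  relative cluster sizes (summing to one)
    n:  desired total number of superpixels
    n1: number of superpixels at the coarsest resolution
    b:  hyperparameter controlling how fast the density grows
    """
    e = np.asarray(e, dtype=float)
    a = np.asarray(a, dtype=float)
    t = (e - e[0]) / (e[-1] - e[0])                  # normalized edge density
    w = 1.0 - b ** t if b < 1 else b ** (t - 1.0)    # Equation (2)
    r = w / w[-1]                                    # w_k / w_K
    # Solve Equation (1) for n_K, then fill in the intermediate n_k:
    nK = (n - n1 * np.sum(a * (1.0 - r))) / np.sum(a * r)
    return n1 + (nK - n1) * r
```

By construction, the size-weighted sum $\sum_k a_k n_k$ recovers the requested total $n$, and for $b < 1$ the coarsest cluster keeps exactly $n_1$ superpixels since its weight $w_1$ is zero.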

Figure 1: Illustration of the process; from left to right: image with ground truth, thresholded edges, smoothed edges, clustering based on edge density, generated superpixel segmentation.
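The preprocessing steps shown in Figure 1 might be sketched as follows. This is a rough stand-in of my own: a plain gradient magnitude replaces the structured forest edge detector [6,7], a tiny 1D k-means replaces a library call, and all parameter values are chosen arbitrarily.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def edge_density_clusters(image, threshold=0.1, sigma=5.0, K=3, iters=20):
    """Sketch: threshold edges, smooth, and cluster pixels by edge density."""
    # 1. Edge detection (stand-in): gradient magnitude of the grayscale image.
    gy, gx = np.gradient(image.astype(float))
    edges = np.sqrt(gx ** 2 + gy ** 2)
    # 2. Threshold and Gaussian smoothing -> local edge density per pixel.
    density = gaussian_filter((edges > threshold).astype(float), sigma)
    # 3. 1D k-means (Lloyd's algorithm) on the density values.
    values = density.ravel()
    centers = np.quantile(values, np.linspace(0, 1, K))  # spread initial centers
    for _ in range(iters):
        labels = np.abs(values[:, None] - centers[None, :]).argmin(axis=1)
        for k in range(K):
            if np.any(labels == k):
                centers[k] = values[labels == k].mean()
    order = np.argsort(centers)            # relabel so cluster 0 = lowest density
    labels = order.argsort()[labels]
    return labels.reshape(image.shape), np.sort(centers)
```

The resulting label map assigns each pixel to one of $K$ edge-density clusters, within which a superpixel algorithm is then run at the corresponding density.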

In a post-processing step, after the algorithm has been run $K$ times, the $K$ different segmentations need to be combined. The problem of naively combining them based on the clusters is illustrated in Figure 2. Instead, across cluster borders, the authors always keep the full superpixels of the finer segmentation. Afterwards, connected components are found as in [1] and small superpixels are merged into neighboring ones until the desired number of superpixels is met.
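The combination step can be sketched roughly as below. This is my own reading of the procedure, not the authors' code; `combine_segmentations` and its label bookkeeping are a made-up illustration, and the subsequent merging of small fragments is omitted. Segmentations are processed from finest to coarsest so that a finer superpixel crossing a cluster border claims all of its pixels first.

```python
import numpy as np

def combine_segmentations(segs, clusters):
    """Combine K per-cluster segmentations into one label map (sketch).

    segs:     list of K label maps (H x W); segs[k] is the segmentation run
              for cluster k, ordered coarse to fine (higher k = finer)
    clusters: H x W map assigning each pixel a cluster index in [0, K)
    """
    combined = np.zeros(clusters.shape, dtype=np.int64)  # 0 = unassigned
    offset = 0
    for k in range(len(segs) - 1, -1, -1):  # finest first, so finer ones win
        unassigned = combined == 0
        # Superpixels of segmentation k that touch cluster k are kept whole,
        # even where they extend across the cluster border.
        labels_in_cluster = np.unique(segs[k][clusters == k])
        mask = np.isin(segs[k], labels_in_cluster) & unassigned
        combined[mask] = segs[k][mask] + 1 + offset  # relabel into a shared range
        offset += segs[k].max() + 1
    return combined
```

In this sketch, a superpixel of the finer segmentation that straddles a cluster border keeps all of its pixels, which avoids the broken fragments produced by a naive per-cluster cut.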

Figure 2: Illustration of the post-processing problem of combining superpixel segmentations. From left to right: original image, clustering based on average edge density, naive combination of superpixel segmentations, proposed method.

In their experiments, they apply the proposed method to SLIC [1] and SEEDS [5] based on our benchmark [23]. They discuss parameter optimization but do not give the chosen parameters; they also do not report the parameters of SLIC and SEEDS for the different densities. From the qualitative results in Figure 3, however, I suspect that different parameters are used for the different densities. In particular, the last two columns, showing plain SLIC versus the proposed method, reveal that different compactness parameters might have been used.

Figure 3: Qualitative results on the BSDS500 (top two rows) and NYUv2 (last row) datasets. From left to right: image and ground truth, smoothed edges, clustering, SLIC, and SLIC with proposed seeding.

In numbers, regarding Boundary Recall and Undersegmentation Error, the adaptive version of SLIC improves Boundary Recall but shows no improvement in Undersegmentation Error. For SEEDS, it is the other way around: Boundary Recall is not improved while Undersegmentation Error is. They also show, using ground truth edges as input, that the approach yields significant performance gains given a reliable edge detector. Unfortunately, they do not show experiments with different edge detectors, and they do not report runtimes.

  • [1] Achanta, R., Shaji, A., Smith, K., Lucchi, A., Fua, P., Süsstrunk, S.: SLIC superpixels compared to state-of-the-art superpixel methods. IEEE TPAMI 34(11), 2274–2282 (2012)
  • [5] Van den Bergh, M., Boix, X., Roig, G., de Capitani, B., Van Gool, L.: SEEDS: Superpixels extracted via energy-driven sampling. In: ECCV (2012)
  • [6] Dollár, P., Zitnick, C.L.: Structured forests for fast edge detection. In: ICCV (2013)
  • [7] Dollár, P., Zitnick, C.L.: Fast edge detection using structured forests. IEEE TPAMI 37(8), 1558–1570 (2015)
  • [10] Gao, G., Lauri, M., Zhang, J., Frintrop, S.: Saliency-guided adaptive seeding for supervoxel segmentation. In: accepted for IROS (2017).
  • [23] Song, S., Lichtenberg, S.P., Xiao, J.: SUN RGB-D: A RGB-D scene understanding benchmark suite. In: CVPR (2015)
  • [28] Wang, S., Lu, H., Yang, F., Yang, M.H.: Superpixel tracking. In: ICCV (2011)
