Lee et al. introduce contour-constrained superpixels (CCS), a superpixel algorithm closely related to SEEDS that hierarchically derives superpixels following object contours. In particular, the general setup follows SEEDS: As illustrated in Figure 1, several hierarchy levels are considered; at each level, blocks or pixels can be exchanged between neighboring superpixels in order to (locally) minimize an energy. In contrast to SEEDS, however, the hierarchical partition is not uniform. Instead, a higher-level block is only split if its content is found to be inhomogeneous. A subblock then changes its label if doing so lowers a composite energy consisting of four terms: the feature/color distance to the corresponding centroids, a boundary length cost (i.e. the length of the superpixel boundaries), an inter-region color cost (i.e. the maximum difference of colors within a superpixel) and a contour constraint cost. While the first three costs are mostly self-explanatory (see the paper for details), the last one is more interesting and arguably the main contribution of the paper.
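To make the label update more concrete, here is a minimal sketch of how a subblock could pick its label by comparing a composite energy across candidate superpixels. Note that the weighted sum, the weights and all names (`Costs`, `composite_energy`, `best_label`) are my own assumptions for illustration; the paper may combine the four terms differently.

```python
from dataclasses import dataclass


@dataclass
class Costs:
    """The four cost terms for assigning a subblock to one candidate superpixel."""
    feature_distance: float    # feature/color distance to the superpixel centroid
    boundary_length: float     # boundary length cost
    color_difference: float    # maximum color difference within the superpixel
    contour_constraint: float  # contour constraint cost


def composite_energy(costs, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted sum of the four terms (the weighting scheme is an assumption)."""
    w_feat, w_bound, w_color, w_contour = weights
    return (w_feat * costs.feature_distance
            + w_bound * costs.boundary_length
            + w_color * costs.color_difference
            + w_contour * costs.contour_constraint)


def best_label(candidates, weights=(1.0, 1.0, 1.0, 1.0)):
    """Return the label with the lowest composite energy.

    `candidates` maps each candidate label (including the current one)
    to the Costs of assigning the subblock to that superpixel.
    """
    return min(candidates, key=lambda label: composite_energy(candidates[label], weights))
```

In a full implementation, `candidates` would only contain the current label and the labels of adjacent superpixels, so that each update stays local.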
Figure 1: Illustration of the hierarchical approach to superpixel segmentation, similar to SEEDS. Note that in a particular step, e.g. levels 1-3, not all blocks are subdivided.
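The non-uniform hierarchy can be sketched as a recursive split that only subdivides blocks whose content is inhomogeneous. The homogeneity test used here (maximum per-channel standard deviation against a threshold) and the quadrant split are assumptions on my side, chosen only to illustrate the idea; the criterion used in the paper may differ.

```python
import numpy as np


def is_inhomogeneous(block, threshold=10.0):
    """Hypothetical criterion: largest per-channel standard deviation exceeds a threshold."""
    pixels = block.reshape(-1, block.shape[-1])
    return float(pixels.std(axis=0).max()) > threshold


def adaptive_split(image, y, x, h, w, min_size=1, threshold=10.0):
    """Recursively split the block (y, x, h, w) of an H x W x C image into quadrants.

    A block is only subdivided if it is inhomogeneous and larger than
    `min_size` in both dimensions. Returns a list of (y, x, h, w) blocks.
    """
    block = image[y:y + h, x:x + w]
    if h <= min_size or w <= min_size or not is_inhomogeneous(block, threshold):
        return [(y, x, h, w)]
    h2, w2 = h // 2, w // 2
    blocks = []
    for dy, dh in ((0, h2), (h2, h - h2)):
        for dx, dw in ((0, w2), (w2, w - w2)):
            blocks += adaptive_split(image, y + dy, x + dx, dh, dw, min_size, threshold)
    return blocks
```

Calling `adaptive_split(image, 0, 0, *image.shape[:2])` on a mostly uniform image leaves large blocks intact and only refines the regions where the colors actually vary, which is the behavior Figure 1 hints at.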
In order to define a contour cost between distant, i.e. non-neighboring, pixels, a set of contour patterns is “learned” (or collected) from the BSDS500 ground truth contours. This is illustrated in Figure 2. Detected edges are then mapped to these patterns, which subsequently define the probability of both pixels belonging to the same object (illustrated in Figure 2, bottom). In particular, all patterns matching the observed edges are retrieved, and the probability is taken as the fraction of retrieved patterns in which the two pixels belong to the same object. Based on this probability, a cost function is defined.
Figure 2: Illustration of the pattern matching used to determine the contour cost. a) the original image, b) the derived edges, c) the ground truth edges over the BSDS500 training set, d) non-maximum suppressed edges, e) set of contours extracted from the training set, f) contour matching. Below, an example is shown where multiple patterns are retrieved and implicitly define the contour cost.
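The probability estimate described above can be sketched as follows. I assume that each collected pattern is stored as a pair of a binary contour mask and the ground truth region labels of the same patch, and that a pattern "matches" if its contour mask differs from the observed edge patch in at most `max_mismatch` pixels. Both the representation and the matching rule are simplifications of mine, not the paper's exact procedure.

```python
import numpy as np


def same_object_probability(observed_edges, patterns, p, q, max_mismatch=0):
    """Estimate the probability that pixels p and q belong to the same object.

    observed_edges: boolean patch of detected edges.
    patterns: list of (contour_mask, region_labels) pairs of the same patch size
              (hypothetical storage format for the collected patterns).
    p, q: (row, col) positions inside the patch.
    Returns None if no pattern matches the observed edges.
    """
    matched_labels = [
        labels for contour, labels in patterns
        if np.count_nonzero(contour != observed_edges) <= max_mismatch
    ]
    if not matched_labels:
        return None
    # Fraction of matching patterns in which p and q share a ground truth region.
    same = sum(1 for labels in matched_labels if labels[p] == labels[q])
    return same / len(matched_labels)
```

The contour cost would then be some decreasing function of this probability, so that moving a subblock across a likely object boundary becomes expensive; I leave the exact form to the paper.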
In experiments, the authors also apply the proposed method to supervoxel generation in videos (i.e. temporal superpixels). Figure 3 shows qualitative results on images. Quantitatively, they show that their method outperforms existing ones, including SEEDS, but only marginally. As details on the experimental setup are missing (and the number of superpixels used is always a multiple of 100, which is impossible for some of the algorithms), this improvement is not significant, at least in my opinion.