IAM

OPENSOURCEFAN STUDYING
STUDYINGCOMPUTERSCIENCEANDMATH COMPUTERSCIENCE

Check out the latest superpixel benchmark — Superpixel Benchmark (2016) — and let me know your opinion! @david_stutz
19thMARCH2015

READING

C. Conrad, M. Mertz, R. Mester. Contour-relaxed superpixels. Energy Minimization Methods in Computer Vision and Pattern Recognition, 2013.

This paper introduces Contour Relaxed Superpixels, a statistical approach to superpixel segmentation. In particular, the value $I_c(x_n)$ of pixel $x_n$ in channel $c$ is assumed to be the outcome of stochastic process described by the parameters $\theta_{s(x_n), c}$ of the corresponding superpixel $S_{s(x_n)}$ where $s(x_n)$ denotes the superpixel $x_n$ belongs to. Using $\theta$ to denote the set of all such parameters, the superpixel segmentation $S$ maximizing $p(S, \theta | I)$ is searched for. Using bayes theorem, and omitting the normalization factor, the energy to be maximized is given by

$p(S, \theta | I) = \frac{p(I | S, \theta) p(S, \theta)}{p(I)} \propto p(I |S, \theta) p(S, \theta) =: E(S)$.

The parameters $\theta$ are considered deterministic parameters such that $p(S, \theta)$ can be simplified to $p(S, \theta) = \kappa p(S)$. An EM-style (e.g. see [1, p. 423ff]) algorithm is applied: The parameters $\theta$ are optimized using maximum likelihood considering the superpixel segmentation to be constant followed by optimizing for $S$ while the parameters $\theta$ are held constant. Considering each pixel connected to $8$ neighbors, $p(S)$ is modeled using a Gibbs Random Field and can be factorzed into

$p(S) = \kappa' \exp(-N_e C_e - N_v C_v)$

where only the second factor depends on the label of the pixel $x_n$. Here, $N_e$ is the number of direct neighbors of $x_n$ having a different label and $N_v$ is the number of diagonal neighbors with a different label ‐ $C_e$ and $C_v$ are the associated costs. Furtermore, the probability $p(I | S, \theta)$ can be factorized as

$p(I | S, \theta) = \prod _{S_i \in S}\prod_{x_n \in S_i} \prod_{c=1}^3 p(I_c(x_n) | \theta_{i,c})$

where all the stoachstic processes are considered independent.

The optimization is carried out as follows: Iteratively, each boundary pixel $x_n$ is considered to change its label. When $\theta$ is assumed constant, $x_n$ is assigned the label maximizing

$E(S) = \kappa \kappa' \exp(-N_e C_e - N_v C_v) \prod_{S_i} \prod_{x_m \in S_i}\prod_{c=1}^3 p(I_c(x_m)|\theta_{i,c})$.(1)

This approach can be categorized as gradient ascent approach to superpixel segmentation and is summarized in algorithm 1. The first product in equation (1) runs over all superpixels $S_i$ to which pixel $x_n$ may be assigned.

function crs(
        $I$, // Color image.
        $K$ // Number of superpixels.
    )
    // The step size R can be derived from the image size W ×H and K:
    initialize $S$ as regular grid with step size $R$
    initialize $\theta$ using sufficient statistics (e.g. Gaussian)
    for $t = 1$ to $T$
        // Originally, the image is traversed multiple times using different
        // directions to avoid a directional bias:
        for $n = 1$ to $N$
            if $x_n$ is a boundary pixel
                // This can be evaluated by taking θ as constant; Conrad et al.
                // suggest to minimize the negative logarithm of (1) instead:
                assign $x_n$ to the label maximizing equation (1)
    return $S$
Algorithm 1: Countour relaxed superpixels as iterative energy maximization.

A C++ implementation of Contour Relaxed Superpixels is available at the project's webpage. Superpixel segmentations generated using the described approach are shown in figure 1.

ba_bsd_4_crs ba_bsd_5_crs ba_bsd_6_crs ba_bsd_7_crs ba_bsd_8_crs

Figure 1 (click to enlarge): Images from the Berkeley Segmentation Dataset [2] oversegmented using Contour Relaxed Superpixels.
  • [1] C. Bishop. Pattern Recognition and Machine Learning. Springer Verlag, New York, 2006.
  • [2] P. Arbeláez, M. Maire, C. Fowlkes, J. Malik. Contour detection and hierarchical image segmentation. Transactions on Pattern Analysis and Machine Intelligence, volume 33, number 5, pages 898–916, 2011.

What is your opinion on the summarized work? Or do you know related work that is of interest? Let me know your thoughts in the comments below or using the following platforms: