IAM

FEBRUARY2015

READING

M. Y. Lui, O. Tuzel, S. Ramalingam, R. Chellappa. Entropy rate superpixel segmentation. Conference on Computer Vision and Pattern Recognition, 2011.

Another graph-based method for superpixel segmentation was proposed by Lui et al. Using greedy optimization, summarized in algorithm 1, an objective function based on the entropy rate of a random walk on the graph $\hat{G} = (V,M)$ with $M \subseteq E$ is proposed (where we interpret the image $I$ as 4-connected graph $G = (V,E)$):

$E(\hat{G}) = H(\hat{G}) + \lambda B(\hat{G})$

where $H(\hat{G})$ refers to the entropy rate of the randon walk, while $B(\hat{G})$ defines a balancing term. The objective is maximized subject to the constraint that the number of connected components in $\hat{G}$ is equal or lower to the desired number of superpixels $K$. Given weights $w_{n,m}$ between pixels $x_n$ and $x_m$, defined using a Gaussian kernel based on the $L_1$ color distance, $H(\hat{G})$ is defined as:

$H(\hat{G}) = - \sum_{n = 1}^N \frac{w_n}{\sum_{m = 1}^N w_m} \sum_{m = 1}^N p_{n,m} \log(p_{n,m})$

where we set $w_n = \sum_{m = 1}^N w_{n,m}$ and $N$ denotes the number of pixels. Here, the probabilities $p_{n,m}$ refer to the transition probabilities of the random walk and can be calculated as

$p_{n,m} = \frac{w_{n,m}}{w_n}$ if $n \neq m$ and $(n,m) \in M$,
$p_{n,m} = 1 - \frac{\sum_{m = 1:(n,m) \in M}^N w_{n,m}}{w_n}$ if $n \neq m$ and $(n,m) \notin M$,
$p_{n,m} = 0$ otherwise.

The balacing term favors superpixels of approximately the same size (regarding the number of pixels):

$B(\hat{G}) = - \sum_{i = 1}^K \frac{|S_i|}{N} log\left(\frac{|S_i|}{N}\right)$.

where $S_i$ denotes the $i^\text{th}$ superpixel. Starting from an initial superpixels segmentation where each pixel forms its own superpixel, the algorithm greedily adds edges to merge superpixels until the desired number of superpixels is reached, see algorithm 1.

function ers(
        $G = (V,E)$ // Undirected weighted graph.
    )
    initialize $M = \emptyset$
    for each edge $(n,m) \in E$
        // Let $\hat{G}$ denote the graph $(V, M \cup \{(n,m)\})$:
        let $(n,m)$ be the edge yielding the largest gain in the energy $E(\hat{G})$
        if $\hat{G}$ contains $K$ connected components or less:
            $M := M \cup \{(n,m)\}$.
    derive superpixel segmentation $S$ from $\hat{G}$
    return $S$
Algorithm 1: The greedy optimization scheme used to maximize $E(\hat{G})$.

The implementation is availabel online at mingyuliu.net. Figure 1 shows several images from the Berkeley Segmentation Dataset [1] oversegmented using the approach described above.

ba_bsd_4_ers ba_bsd_5_ers ba_bsd_6_ers ba_bsd_7_ers ba_bsd_8_ers

Figure 1 (click toenlarge): Images from the Berkeley Segmentation Dataset [1] generated using the approach proposed by Lui et al.
  • [1] P. Arbeláez, M. Maire, C. Fowlkes, J. Malik. Contour detection and hierarchical image segmentation. Transactions on Pattern Analysis and Machine Intelligence, volume 33, number 5, pages 898–916, 2011.
What is your opinion on this article? Let me know your thoughts on Twitter @davidstutz92 or LinkedIn in/davidstutz92.