IAM

JANUARY2015

READING

A. Vedaldi, S. Soatto. Quick Shift and Kernel Methods for Mode Seeking. European Conference on Computer Vision, 2008.

Proposed in 2008, Quick Shift is a segmentation algorithm based on mode seeking which can also be used to generate superpixels. In general, a mode seeking algorithm starts from a Parzen density estimate $p(x_n)$ for all pixels $x_n \in \{1,\ldots,W\} \times \{1,\ldots,H\}$ (here, $W$ and $H$ are the width and height of the image, respectively) and each pixel is assigned to a mode by following the density $p(x_n)$ upwards, that is in the direction of the gradient. Therefore, Quick Shift may be categorized as gradient ascent method. In particular, Quick Shift pre-computes $p(x_n)$ for all pixels using a Gaussian kernel. In practice, given an image $I$, the distance $d(x_n, x_m)$ used within the Gaussian kernel consists of a color term and a spatial term:

$d(x_n, x_m) = α \| I(x_n)−I(x_m)\|_2 + \|x_n −x_m\|_2$

where $α$ weights the influence of the color term. Subsequently, each pixel $x_n$ gets assigned to the pixel $x_m \in N_R(x_n) = \{x_m : \|x_n −x_m\|_\infty \leq R/2\}$ such that $p(x_m) > p(x_n)$ or is left unassigned. These assignments correspond to the modes, which represent the final superpixels. The algorithm is summarized in algorithm 1.

function quickshift(
        $I$ // Color image.
    )
    for $n = 1$ to $W\cdot H$
        initialize $t(x_n) = 0$
    for $n = 1$ to $W\cdot H$
        // $N_R(x_n)$ is the set of all pixels in the neighborhood of size $N$ around $x_n$.
        calculate $p(x_n) = \sum_{x_m \in N_R(x_n)} \exp\left(\frac{-d(x_n, x_m)}{(2/3)R}\right)$
    for $n = 1$ to $W\cdot H$
        set $t(x_n) = \arg \max_{x_m \in N_R(x_n):p(x_m) > p(x_n)}\{p(x_m)\}$
    // $t$ can be interpreted as forest, where all pixels $x_n$ with $t(x_n) = 0$ are roots.
    derive superpixel segmentation $S$ from $t$
    return $S$
Algorithm 1: The superpixel algorithm Quick Shift.

The original implementation is part of the VLFeat Library. Generated superpixel segmentations can be found in figure 1.

ba_bsd_4_qs ba_bsd_5_qs ba_bsd_6_qs ba_bsd_7_qs ba_bsd_8_qs

Figure 1: Images from the Berkeley Segmentation Dataset [1] oversegmented using Quick Shift.
  • [1] P. Arbeláez, M. Maire, C. Fowlkes, J. Malik. Contour detection and hierarchical image segmentation. Transactions on Pattern Analysis and Machine Intelligence, volume 33, number 5, pages 898–916, 2011.
What is your opinion on this article? Let me know your thoughts on Twitter @davidstutz92 or LinkedIn in/davidstutz92.