"Quick Shift and Kernel Methods for Mode Seeking", Vedaldi and Soatto • David Stutz

JANUARY2015

READING

A. Vedaldi, S. Soatto. Quick Shift and Kernel Methods for Mode Seeking. European Conference on Computer Vision, 2008.

COMPUTER VISION

Proposed in 2008, Quick Shift is a segmentation algorithm based on mode seeking which can also be used to generate superpixels. In general, a mode seeking algorithm starts from a Parzen density estimate $p(x_n)$ for all pixels $x_n \in \{1,\ldots,W\} \times \{1,\ldots,H\}$ (here, $W$ and $H$ are the width and height of the image, respectively) and each pixel is assigned to a mode by following the density $p(x_n)$ upwards, that is in the direction of the gradient. Therefore, Quick Shift may be categorized as gradient ascent method. In particular, Quick Shift pre-computes $p(x_n)$ for all pixels using a Gaussian kernel. In practice, given an image $I$, the distance $d(x_n, x_m)$ used within the Gaussian kernel consists of a color term and a spatial term:

$d(x_n, x_m) = α \| I(x_n)−I(x_m)\|_2 + \|x_n −x_m\|_2$

where $α$ weights the influence of the color term. Subsequently, each pixel $x_n$ gets assigned to the pixel $x_m \in N_R(x_n) = \{x_m : \|x_n −x_m\|_\infty \leq R/2\}$ such that $p(x_m) > p(x_n)$ or is left unassigned. These assignments correspond to the modes, which represent the final superpixels. The algorithm is summarized in algorithm 1.

function quickshift(
        $I$ // Color image.
    )
    for $n = 1$ to $W\cdot H$
        initialize $t(x_n) = 0$
    for $n = 1$ to $W\cdot H$
        // $N_R(x_n)$ is the set of all pixels in the neighborhood of size $N$ around $x_n$.
        calculate $p(x_n) = \sum_{x_m \in N_R(x_n)} \exp\left(\frac{-d(x_n, x_m)}{(2/3)R}\right)$
    for $n = 1$ to $W\cdot H$
        set $t(x_n) = \arg \max_{x_m \in N_R(x_n):p(x_m) > p(x_n)}\{p(x_m)\}$
    // $t$ can be interpreted as forest, where all pixels $x_n$ with $t(x_n) = 0$ are roots.
    derive superpixel segmentation $S$ from $t$
    return $S$

Algorithm 1: The superpixel algorithm Quick Shift.

The original implementation is part of the VLFeat Library. Generated superpixel segmentations can be found in figure 1.

Figure 1: Images from the Berkeley Segmentation Dataset [1] oversegmented using Quick Shift.

[1] P. Arbeláez, M. Maire, C. Fowlkes, J. Malik. Contour detection and hierarchical image segmentation. Transactions on Pattern Analysis and Machine Intelligence, volume 33, number 5, pages 898–916, 2011.

What is your opinion on this article? Let me know your thoughts on Twitter @davidstutz92 or LinkedIn in/davidstutz92.

IAM

DAVIDSTUTZ

READING

SEARCHTHEBLOG

ARCHIVES

TAGS