"Multiscale Combinatorial Grouping", Arbeláez, Pont-Tuset, Barron, Marques, Malik • David Stutz

OCTOBER2016

READING

P. Arbeláez J. Pont-Tuset, J. T. Barron, F. Marqués, J. Malik. Multiscale Combinatorial Grouping.Conference on Computer Vision and Pattern Recognition, 2014.

COMPUTER VISION

Arbeláez et al. propose a novel hierarchical approach to image segmentation where the lowest level is based on the well-known normalized cuts algoithms [1]. While the approach itself produces promising results, I was most interested in the proposed "fast" normalized cuts algorithms.

As derived in [1], the normalized cuts algorithms involves solving for the eigenvalues of the Laplacian $A = D - W$ where $W$ is the weight matrix of the weighted graph corresponding to the image and $D$ is the diagonal matrix holding the total outgoing weights of each pixel. For producing an oversegmentation, the $k$ smallest eigenvalues and the corresponding eigenvectors are sought for. With high resolution images, this problem is computationally challenging (not to say infeasible). However, Arbeláez et al. propose an efficient variant shown in Algorithm 1.

function dncuts(
        $A$, // Laplacian/affinity matrix
        $D$, // Number of downsampling
        $k$, // Number of eigenvalues
    )
    $A_0 := A$
    for $d = 1, \ldots, D$
        // Subsampling $A_{d - 1}$.
        $i_d$ := pixel_decimate($A_{d - 1}$)
        // Efficient squaring of the subsampled matrix, and normalizing.
        $B_d := A_{d - 1}[:, i_d]$
        $C_d := \text{diag}(B_d 1)^{-1} B_d$
        $A_d := C_d^T B_d$
    $X_D :=$ ncuts($A_D$, $k$)
    for $d = D, \ldots, 1$
        $X_{d - 1} := C_d X_d$

Algorithm 1: Efficient normalized cuts by downsampling.

dncuts (for downsampled normalized cuts) is motivated by two observations:

eigenvectors of $A$ and $A^2$ are the same;
eigenvectors of $A$ should be similar to those of a downsampled version of $A$.

The algorithm proceeds as follows: given $A$, it is subsampled by taken only every other pixel (this is done via pixel_decimate which returns every oher pixel index). Let $i$ denote the subsampled indices, then $A[i, i]$ denotes the subsampled matrix. As this results in neighboring pixels being disconneted, Arbeléz et al. consider $A[i,i]^2$. The eigenvectors of $A[i,i]^2$ are computed and upsampled by multiplying by $A[:,i]$. In algorithm 1, this procedure is repeated $D$ times.

[1] J. Shi, J. Malik. Normalized Cuts and Image Segmentationn. Conference on Computer Vision and Pattern Recognition, 1997.

What is your opinion on this article? Let me know your thoughts on Twitter @davidstutz92 or LinkedIn in/davidstutz92.

IAM

DAVIDSTUTZ

READING

SEARCHTHEBLOG

ARCHIVES

TAGS