IAM

ARTICLE

Efficient High-Quality Superpixels: SEEDS Revised

SEEDS Revised is a new implementation of the superpixel algorithm SEEDS [1] used for evaluation in my bachelor thesis “Superpixel Segmentation using Depth Information”. This article introduces the basic concepts of SEEDS as well as the usage of SEEDS Revised.

Overview

SEEDS Revised is an implementation of the superpixel algorithm SEEDS proposed in [1] and used for evaluation purposes in [2]. Based on color histograms, an initial superpixel segmentation is iteratively refined. The algorithm is able to generate high-quality superpixel segmentations in near realtime. The implementation is based on OpenCV (compatible with both OpenCV 2 and OpenCV 3), provides thorough documentation and a clean API. Further, a one-parameter variant requiring only the desired number of superpixels is provided:

cv::Mat image = cv::imread(filepath);
int superpixels = 800;
int iterations = 2;

SEEDSRevisedMeanPixels seeds(image, superpixels);
seeds.initialize();
seeds.iterate(iterations);

// Retrieve labels and put out contour image.
int** labels = seeds.getLabels();

SEEDS Revised on GitHub

The bachelor thesis [2] is available online: Superpixels / SEEDS

You may also go directly to Usage and Parameters instead of learning more about the actual algorithm.

Algorithm

Given an image $I: \{1, \ldots, W\} \times \{1, \ldots, H\} \rightarrow \{0, \ldots, 255\}^3$, the algorithm starts by grouping pixels into blocks of size $w^{(1)} \times h^{(1)}$. These blocks are then further arranged into blocks of $2 \times 2$. This scheme is applied recursively to form a hierarchy of blocks such that blocks at level $l$ have size

$w^{(l)} = w^{(1)} \cdot 2^{l-1}$ and $h^{(l)} = h^{(1)} \cdot 2^{l-1}$.

This is repeated for a total of $L$ levels. Therefore, the number of superpixels is calculated as follows:

$S = \frac{W}{w^{(L)}} \cdot \frac{H}{h^{(L)}}$

where we assume the divisions to be integer divisions. In practice, this means that the number of levels $L$ and the initial block size $w^{(1)} \times h^{(1)}$ need to be dervied from the desired number of superpixels $K$.

The initial superpixel segmentation is refined by exchanging blocks of pixels as well as single pixels between neighboring superpixels. Therefore, let $B_i^{(l)} \subseteq \{1,\ldots, W\}\times\{1,\ldots, H\}$ be a block of pixels at level $l$ and $S_j$ be a superpixel. With $h_{B_i^{(l)}}$ we denote the color histogram of block $B_i^{(l)}$. Then, the similarity of block $B_i^{(l)}$ and superpixel $S_j$ is given by the intersection of the corresponding histograms:

$\cap(h_{B_i^{(l)}}, h_{S_j}) = \sum_{q = 1}^Q \min\{h_{B_i^{(l)}}(q), h_{S_j}(q)\}$

where we assume the histograms to be normalized. Furthermore, the similarity of a single pixel $x_n$ and the superpixel $S_j$ is given by

$h_{S_j}(h(x_n))$

where $h(x_n)$ is the histogram bin of pixel $x_n$. Given this basic framework, algorithm 1 describes a simple implementation of SEEDS.

function SEEDS(
        $I$, // Color image.
        $w^{(1)} \times h^{(1)}$, // Initial block size.
        $L$, // Number of levels.
        $Q$ // Histogram size.
    )
    initialize the block hierarchy and the initial superpixel segmentation
    // Initialize histograms for all blocks and superpixels:
    for $l = 1$ to $L$
        // At level $l = L$ these are the initial superpixels:
        for each block $B_i^{(l)}$ 
            initialize histogram $h_{B_i^{(l)}}$
    // Block updates:
    for $l = L - 1$ to $1$
        for each block $B_i^{(l)}$
            let $S_j$ be the superpixel $B_i^{(l)}$ belongs to
            if a neighboring block belongs to a different superpixel $S_k$
                if $\cap(h_{B_i^{(l)}}, h_{S_k}) > \cap(h_{B_i^{(l)}}, h_{S_j})$
                    $S_k := S_k \cup B_i^{(l)}$, $S_j := S_j - B_i^{(l)}$
    // Pixel updates:
    for $n = 1$ to $W\cdot H$
        let $S_j$ be the superpixel $x_n$ belongs to
        if a neighboring pixel belongs to a different superpixel $S_k$
            if $h_{S_k}(h(x_n)) > h_{S_j}(h(x_n))$
                $S_k := S_k \cup \{x_n\}$, $S_j := S_j - \{x_n\}$
    return $S$
Algorithm 1: The basic algorithm of SEEDS. In practice, both block updates as well as pixel updates can be iterated more than once.

As alternative, the similarity of a pixel $x_n$ and a superpixel $S_j$ can be expressed as distance:

$d(x_n, S_j) = \|I(x_n) - I(S_j)\|_2$

where $I(S_j)$ is the mean color of the superpixel $S_j$. This variant of pixel updates is called mean pixel updates [1].

As smooth and compact superpixels are preferrable, van den Bergh et al. [1] introduced a smoothing term: For each pair of pixels $(x_n, x_m)$ (assuming that these pixels belong to different superpixels $S_j$ and $S_k$, respectively), we consider the $3 \times 4$ or $4\times 3$ neighborhood and count the number $N_{n,m,j}$ of pixels belonging to superpixel $S_j$ as well as the number $N_{n,m,k}$ of pixels belonging to superpixel $S_k$. Then, the criterion for pixel updates changes to

$N_{n,m,k}\cdot h_{S_k}(h(x_n)) > N_{n,m,j}\cdot h_{S_j}(h(x_n))$.

This way, we encourage individual pixels to belong to the same superpixel as its neighboring pixels and, thus, enforce smoothness. Of course, this approach can also be applied to mean pixel updates, however, mean pixel updates can be extendend to include an explicit compactness term by using the distance

$d(x_n, S_j) = \|I(x_n) - I(S_j)\|_2 + \beta \|x_n - \mu(S_j)\|_2$, $x_n \in \{1, \ldots, W\} \times \{1, \ldots, H\}$,(∗)

where $\mu(S_j)$ is the mean position of the superpixel $S_j$. The procedure given in algorithm 1 can easily be extended to include the above considerations, see [2].

Complexity

SEEDS runs linear in the total number $N = W\cdot H$ of pixels: SEEDS $\in \mathcal{O}(QTN)$ where $T$ is the number of iterations of block and pixel udpates and $Q$ the histogram size. This can easily be derived when considering to pre-compute the block and superpixel histograms - the histograms can then be updated whenever a block or pixel changes its label.

While the above statement is true, in practice, the runtime of SEEDS does also depend on the number of levels $L$ as well as the block size $w^{(1)} \times h^{(1)}$. Therefore, the runtime only depends indirectly on the number of superpixels $K$ - in particular, the runtime does not scale with $K$.

Usage and Parameters

Compiling

The implementation is based on CMake; e.g. on Ubuntu, CMake can be installed using

$ sudo apt-get install cmake

In addition, OpenCV is required (compatible with both OpenCV 2 and OpenCV 3). See the website for installation details. Then, the implementation can be compiled using

$ git clone https://github.com/davidstutz/seeds-revised.git
$ cd seeds-revised/build
$ cmake ..
$ make
$ ../bin/cli --help
Allowed options:
    --help                       produce help message
    --input arg                  the folder to process, may contain several 
                              images
    --bins arg (=5)              number of bins used for color histograms
    --neighborhood arg (=1)      neighborhood size used for smoothing prior
    --confidence arg (=0.11)     minimum confidence used for block update
    --iterations arg (=2)        iterations at each level
    --spatial-weight arg (=0.25) spatial weight
    --superpixels arg (=400)     desired number of supüerpixels
    --verbose                    show additional information while processing
    --csv                        save segmentation as CSV file
    --contour                    save contour image of segmentation
    --labels                     save label image of segmentation
    --mean                       save mean colored image of segmentation
    --output arg (=output)       specify the output directory (default is 
                              ./output)

Usage

The implementation provides two classes: SEEDSRevised and SEEDSRevisedMeanPixels where the latter implements the so called mean pixel updates (see above or [2]). The following example shows how to use the SEEDSRevisedMeanPixels class to oversegment a given image:

#include <opencv2/opencv.hpp>
#include "SeedsRevised.h"
#include "Tools.h"

// ...

cv::Mat image = cv::imread(filepath);

// Number of desired superpixels.
int superpixels = 800;

// Number of bins for color histograms (per channel).
int numberOfBins = 5;

// Size of neighborhood used for smoothing term, see [1] or [2].
// 1 will be sufficient, >1 will slow down the algorithm.
int neighborhoodSize = 1;

// Minimum confidence, that is minimum difference of histogram
// intersection needed for block updates: 0.1 is the default value.
float minimumConfidene = 0.1;

// The weighting of spatial smoothing for mean pixel updates - the
// euclidean distance between pixel coordinates and mean superpixel
// coordinates is used and weighted according to:
//  (1 - spatialWeight)*colorDistance + spatialWeight*spatialDistance
// The higher spatialWeight, the more compact superpixels are generated.
float spatialWeight = 0.25;

// Number of iterations at each level.
int iterations = 2;

// Instantiate a new object for the given image.
SEEDSRevisedMeanPixels seeds(image, superpixels, numberOfBins, neighborhoodSize, minimumConfidence, spatialWeight);

// Initializes histograms and labels.
seeds.initialize();
// Run block and pixel updates.
seeds.iterate(iterations);

// Save a contour image to the following location:
std::string storeContours = "./contours.png";

// BGR color for contours:
int bgr[] = {0, 0, 204};

// seeds.getLabels() returns a two-dimensional array containing the
// computed superpixel labels.
cv::Mat contourImage = Draw::contourImage(seeds.getLabels(), image, bgr);
cv::imwrite(store, contourImage);

Some details on the parameters:

  • image: the image in the form of a cv::Mat in BGR color space.
  • superpixels: number of desired superpixels.
  • numberOfBins: number of bins per channel used to calculate the color histograms (see above).
  • neighborhoodSize: size of the smoothing window as explained above; $1$ is usually sufficient, greater values will slow down the algorithm and result in smoother superpixels.
  • minimumConfidence: minimum difference of histogram intersection required for block updates.
  • spatialWeight: only available for SEEDSRevisedMeanPixels; used to weight the compactness term in (∗).

Qualitative and Quantitative Results

Figure 1 shows superpixel segmentations acquired on images from the Berkeley Segmentation Dataset [3].

bsd-test-4-oriseeds bsd-test-5-oriseeds bsd-test-6-oriseeds bsd-test-7-oriseeds bsd-test-8-oriseeds
bsd-test-4-oriseedsmp bsd-test-5-oriseedsmp bsd-test-6-oriseedsmp bsd-test-7-oriseedsmp bsd-test-8-oriseedsmp
bsd-test-4-reseeds bsd-test-5-reseeds bsd-test-6-reseeds bsd-test-7-reseeds bsd-test-8-reseeds
bsd-test-4-reseedsmp bsd-test-5-reseedsmp bsd-test-6-reseedsmp bsd-test-7-reseedsmp bsd-test-8-reseedsmp
bsd-test-4-reseedssm bsd-test-5-reseedssm bsd-test-6-reseedssm bsd-test-7-reseedssm bsd-test-8-reseedssm
Figure 1 (click to enlarge): Superpixel segmentations of images taken from the Berkeley Segmentation Dataset generated by: first row - original implementation of SEEDS; second row - original implementation of SEEDS with mean pixel updates; third row - SEEDS Revised; fourth row - SEEDS Revised with mean pixel updates; fifth row - SEEDS Revised with mean pixel updates and compactness term (∗).

Both SEEDS Revised, referred to as reSEEDS, and the original implementation, referred to as oriSEEDS, have been evaluated on training sets of the Berkeley Segmentation Dataset (blue) and the NYU Depth V2 Dataset (red) [4] using an extended version of the Berkeley Segmentation Benchmark [3] which is available on GitHub. Both implementations were evaluated with mean pixel updates and the neighborhood smoothing prior; reSEEDS* additionally uses the compactness term in equation (∗). Figure 2 compares both implementations and shows the influence of the additional compactness term as well as the histogram size. For additional comparison to other state-of-the-art approaches see Superpixels / SEEDS.

Figure 2 (click to enlarge): Evaluation results on the training sets of the Berkeley Segmentation Dataset (blue) and the NYU Depth Dataset V2 (red). $Rec$ denotes Boundary Recall, $UE$ denotes Undersegmentation Error, see here. Furthermore, $Q$ is the histogram size per channel, $\beta$ is the weighting term used in equation (∗).

Runtime

Table 1 compares the runtime of both implementations on the Berkeley Segmentation Dataset and the Nyu Depth V2 Dataset, see the caption for details.

Berkeley Segmentation Dataset, $N = 481 \times 321 = 154401$
$K$oriSEEDSreSEEDSreSEEDS*
2000.140450.067450.06955
4000.236050.1120.1156
6000.33540.140850.14665
8000.172850.083450.08505
10400.212950.095650.09985
NYU Depth V2 Dataset, $N = 608 \times 448 = 272384$
$K$oriSEEDSreSEEDSreSEEDS*
4500.302150.13860.14455
7000.413650.180350.18795
8400.216450.10850.11398
11000.25980.125150.13165
14000.30680.1410.14743
Table 1: Runtime in seconds of both implementations for the Berkeley Segmentation Dataset and the NYU Depth V2 Dataset. The following parameters were used: $T = 2$ iterations of block updates at each level; $2T = 4$ iterations of pixel updates; $Q = 7^3$.

The runtime of both implementations can be reduced by reducing the number of iterations $T$ as well as the histogram size $Q$. In addition, the neighborhood smoothing prior can be dropped - in particular, as reSEEDS* provides the additional compactness term of equation (∗), the neighborhood smoothing prior can be dropped without reducing compactness or smoothness of superpixels. Table 2 shows the runtime with $T = 1$ and $Q = 3^3$ without using the neighborhood smoothing prior.

Berkeley Segmentation Dataset, $N = 481 \times 321 = 154401$
$K$oriSEEDSreSEEDSreSEEDS*
2000.058450.02920.02495
4000.063450.035250.03545
6000.07340.04010.05625
8000.054150.042450.03305
10400.06250.04940.03465
NYU Depth V2 Dataset, $N = 608 \times 448 = 272384$
$K$oriSEEDSreSEEDSreSEEDS*
4500.105830.07120.058125
7000.113950.096350.0563
8400.11280.0730750.049125
11000.117280.0513750.067825
14000.137370.053950.06495
Table 1: Runtime in seconds of both implementations for the Berkeley Segmentation Dataset and the NYU Depth V2 Dataset. The following parameters were used: $T = 1$ iterations of block updates at each level; $2T = 2$ iterations of pixel updates; $Q = 3^3$.

References

  • [1] M. van den Bergh, X. Boix, G. Roig, B. de Capitani, L. van Gool. SEEDS: Superpixels extracted via energy-driven sampling. European Conference on Computer Vision, pages 13–26, 2012.
  • [2] D. Stutz. Superpixel Segmentation using Depth Information. Bachelor thesis, RWTH Aachen University, Aachen, Germany, 2014.
  • [3] P. Arbeláez, M. Maire, C. Fowlkes, J. Malik. Contour detection and hierarchical image segmentation. Transactions on Pattern Analysis and Machine Intelligence, 33(5):898–916, 2011.
  • [4] N. Silberman, D. Hoiem, P. Kohli, R. Fergus. Indoor segmentation and support inference from RGBD images. European Conference on Computer Vision, pages 746–760, 2012.
What is your opinion on this article? Let me know your thoughts on Twitter @davidstutz92 or LinkedIn in/davidstutz92.