DAVIDSTUTZ

Bachelor Thesis Proposal “Superpixel Segmentation using Depth Information”

As part of my bachelor thesis at RWTH Aachen University, I am currently learning as much as I can about superpixel segmentation: the oversegmentation of an image into groups of pixels using low-level features. In this article, I want to give a short introduction by presenting my bachelor thesis proposal.

Update. The LaTeX source of the bachelor thesis and the slides, as well as the code for evaluation, are now available on GitHub.

Update. Find my bachelor thesis here: Bachelor Thesis "Superpixel Segmentation Using Depth Information".

This semester, I began to write my bachelor thesis in the area of computer vision about superpixel segmentation. The thesis is entitled

"Superpixel Segmentation using Depth Information"

Here, I want to give a short introduction in the form of my bachelor thesis proposal.

Proposal

The term superpixel was introduced in [1] and describes a group of pixels obtained by oversegmenting an image using low-level features [13]. Superpixels are often used as a pre-processing step for higher-level algorithms, reducing the computations from millions of pixels to thousands of superpixels. Despite the extensive pool of superpixel algorithms available [1][2][3][4][5][6][7][9][10][12][13][16][17], it seems difficult to meet both speed and quality requirements (each superpixel should overlap with at most one object [18], that is, superpixels should not violate object boundaries). Unfortunately, little research is devoted to comparing and analysing the different superpixel algorithms [11][14][18]. With the availability of consumer-grade RGB-D sensors such as the Microsoft Kinect, several approaches have utilized depth information to improve superpixel quality [13][16]; however, none of these are fast enough to be used in a real-time scenario.
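The quality requirement above (each superpixel should overlap with at most one object) can be checked directly when a ground-truth segmentation is available. The following is a minimal sketch in plain Python, assuming both segmentations are given as equally sized 2D lists of integer labels; the function name `count_leaking_superpixels` and the toy data are hypothetical and only serve as an illustration.

```python
from collections import defaultdict


def count_leaking_superpixels(superpixels, ground_truth):
    """Count superpixels that overlap more than one ground-truth segment.

    Both arguments are equally sized 2D lists of integer labels.
    """
    segments_per_superpixel = defaultdict(set)
    for sp_row, gt_row in zip(superpixels, ground_truth):
        for sp_label, gt_label in zip(sp_row, gt_row):
            # Remember every ground-truth segment this superpixel touches.
            segments_per_superpixel[sp_label].add(gt_label)
    return sum(1 for segments in segments_per_superpixel.values()
               if len(segments) > 1)


# Toy 2x4 image with two objects (0 and 1) and three superpixels.
toy_ground_truth = [[0, 0, 1, 1],
                    [0, 0, 1, 1]]
toy_superpixels = [[0, 1, 1, 2],
                   [0, 1, 1, 2]]

# Only superpixel 1 covers pixels of both objects.
leaking = count_leaking_superpixels(toy_superpixels, toy_ground_truth)
```

A count of zero means that no superpixel violates an object boundary in the sense used above; the metrics used in the literature are more fine-grained than this simple count.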

In this thesis we focus on the approach proposed in [5], called SEEDS. Starting from a regular grid as the initial superpixel segmentation, the approach is based on an energy composed of a boundary term and a color distribution term. This energy is optimized using hill climbing by randomly exchanging pixels or blocks of pixels between neighboring superpixels [5]. For each superpixel, the color distribution term enforces homogeneity in color based on color histograms, whereas the boundary term favors a smooth shape using superpixel histograms (placing an $N \times N$ window at pixel $(i,j)$, the superpixel histogram counts the number of pixels within this window belonging to superpixel $k$ for each $k = 1,\ldots,K$, where $K$ is the total number of superpixels). SEEDS is reported to give good performance in both quality and speed. The iterative nature of SEEDS makes it possible to obtain a valid superpixel segmentation of the image at any time, allowing the user to trade off quality for speed. To our knowledge, no other CPU-based approach obtains superpixel segmentations at a frame rate of 30 Hz, making SEEDS a particularly interesting approach.
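To make the grid initialization and the two histogram types more concrete, the following is a minimal sketch in plain Python. It is not the implementation from [5]: all function names are hypothetical, the window is simply clipped at the image border, colors are assumed to be already quantized into bin indices, and scoring a histogram by the sum of its squared normalized bin counts is an assumption based on our reading of [5].

```python
def grid_initialization(height, width, block_size):
    """Label each pixel with the index of its grid block (row-major)."""
    blocks_per_row = (width + block_size - 1) // block_size
    return [[(i // block_size) * blocks_per_row + (j // block_size)
             for j in range(width)]
            for i in range(height)]


def superpixel_histogram(labels, i, j, n, num_superpixels):
    """Count, for each superpixel k, the pixels labeled k inside the
    n x n window centered at pixel (i, j), clipped at the image border."""
    height, width = len(labels), len(labels[0])
    half = n // 2
    hist = [0] * num_superpixels
    for y in range(max(0, i - half), min(height, i + half + 1)):
        for x in range(max(0, j - half), min(width, j + half + 1)):
            hist[labels[y][x]] += 1
    return hist


def color_histograms(labels, color_bins, num_superpixels, num_bins):
    """Build one histogram of quantized colors per superpixel."""
    hists = [[0] * num_bins for _ in range(num_superpixels)]
    for label_row, bin_row in zip(labels, color_bins):
        for label, color_bin in zip(label_row, bin_row):
            hists[label][color_bin] += 1
    return hists


def concentration(hist):
    """Sum of squared normalized bin counts (assumed quality measure);
    equals 1.0 if all mass falls into a single bin."""
    total = sum(hist)
    if total == 0:
        return 0.0
    return sum((count / total) ** 2 for count in hist)


# 4x4 image split into four 2x2 superpixels.
labels = grid_initialization(4, 4, 2)

# 3x3 window at pixel (1, 1): it touches all four superpixels.
window_hist = superpixel_histogram(labels, 1, 1, 3, 4)

# Quantized colors; superpixel 2 (bottom left) mixes two colors.
color_bins = [[0, 0, 1, 1],
              [0, 0, 1, 1],
              [0, 1, 2, 2],
              [0, 1, 2, 2]]
color_term = sum(concentration(h)
                 for h in color_histograms(labels, color_bins, 4, 3))
```

In this toy example, a window that lies entirely inside one superpixel scores 1.0, while a window straddling several superpixels scores lower; likewise, the color term is largest when every superpixel contains a single quantized color.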

Nevertheless, there is still room for improvement. Although the boundary term favors smooth shapes, the superpixels seem to have highly varying shapes when SEEDS is applied to data that differs slightly from the data used in [5]. This is a drawback, as the compactness of superpixels can be an important requirement in certain applications [14]. In addition, the boundary term does not penalize crossing strong gradients, although such a penalty would help prevent superpixels from violating object boundaries. Further, the corresponding paper does not clearly describe how the randomized exchange of pixels or blocks of pixels is defined and how the randomness affects the resulting superpixel segmentation.

SEEDS will be reimplemented using OpenCV (an open source computer vision library: http://opencv.org/). A major goal of this thesis is the integration of depth information into SEEDS, with the aim of improving quality while retaining speed. In order to improve the superpixel quality, we focus on both the color distribution term and the boundary term as well as on grid initialization and block updates to tackle the flaws mentioned above. Our first implementation will be evaluated without depth information on the BSDS500 Dataset [8] and compared to several other superpixel approaches using the usual metrics as described in detail in [11] as well as in [14]. Improvements based on depth information will be evaluated on the NYU Depth Dataset [19] and compared to approaches that also utilize depth information [13][16].
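As an illustration of what such an evaluation could look like, the following is a minimal sketch of boundary recall, a metric commonly used for superpixel evaluation: the fraction of ground-truth boundary pixels that have a superpixel boundary pixel nearby. This is an assumption about the kind of metric meant above, not a reproduction of the exact definitions in [11]; the function names, the boundary definition (right or bottom neighbor differs) and the square tolerance neighborhood are hypothetical choices.

```python
def boundary_pixels(labels):
    """Return the set of pixels (i, j) whose right or bottom neighbor
    carries a different label."""
    height, width = len(labels), len(labels[0])
    boundary = set()
    for i in range(height):
        for j in range(width):
            differs_right = j + 1 < width and labels[i][j] != labels[i][j + 1]
            differs_below = i + 1 < height and labels[i][j] != labels[i + 1][j]
            if differs_right or differs_below:
                boundary.add((i, j))
    return boundary


def boundary_recall(superpixels, ground_truth, tolerance=1):
    """Fraction of ground-truth boundary pixels with a superpixel boundary
    pixel inside a (2 * tolerance + 1)^2 neighborhood."""
    gt_boundary = boundary_pixels(ground_truth)
    sp_boundary = boundary_pixels(superpixels)
    if not gt_boundary:
        return 1.0  # nothing to recall
    offsets = range(-tolerance, tolerance + 1)
    hits = sum(
        1 for (i, j) in gt_boundary
        if any((i + di, j + dj) in sp_boundary
               for di in offsets for dj in offsets)
    )
    return hits / len(gt_boundary)


# Toy 2x4 example: the superpixel boundaries are one pixel off.
gt_segmentation = [[0, 0, 1, 1],
                   [0, 0, 1, 1]]
sp_segmentation = [[0, 1, 1, 2],
                   [0, 1, 1, 2]]
recall_with_tolerance = boundary_recall(sp_segmentation, gt_segmentation, 1)
recall_exact = boundary_recall(sp_segmentation, gt_segmentation, 0)
```

In the toy example, the superpixel boundaries miss the object boundary by one pixel, so the recall depends entirely on the chosen tolerance; this is one reason why the tolerance should be reported alongside the results.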

References

• [1] X. Ren, J. Malik. Learning a classification model for segmentation. Proceedings of the International Conference on Computer Vision, pages 10 - 17, 2003.
• [2] R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, S. Süsstrunk. SLIC Superpixels. Technical report, EPFL, Lausanne, 2010.
• [3] P. F. Felzenszwalb, D. P. Huttenlocher. Efficient graph-based image segmentation. International Journal of Computer Vision, volume 59, number 2, 2004.
• [4] C. Conrad, M. Mertz, R. Mester. Contour-Relaxed Superpixels. Energy Minimization Methods in Computer Vision and Pattern Recognition, volume 8081 of Lecture Notes in Computer Science, pages 280 - 293. Springer Berlin Heidelberg, 2013.
• [5] M. Van den Bergh, X. Boix, G. Roig, B. de Capitani, L. van Gool. SEEDS - Superpixels Extracted via Energy-Driven Sampling. European Conference on Computer Vision, pages 13 - 26, 2012.
• [6] A. Levinshtein, A. Stere, K. N. Kutulakos, D. J. Fleet, S. J. Dickinson, K. Siddiqi. Turbopixels: Fast Superpixels Using Geometric Flows. IEEE Transactions on Pattern Analysis and Machine Intelligence, pages 2290 - 2297, 2009.
• [7] M.-Y. Liu, O. Tuzel, S. Ramalingam, R. Chellappa. Entropy Rate Superpixel Segmentation. Conference on Computer Vision and Pattern Recognition, pages 2097 - 2104, 2011.
• [8] D. Martin, C. Fowlkes, D. Tal, J. Malik. A Database of Human Segmented Natural Images and Its Application to Evaluating Segmentation Algorithms and Measuring Ecological Statistics. International Conference on Computer Vision, pages 416 - 423, 2001.
• [9] A. P. Moore, J. D. Prince, J. Warrell. Lattice Cut - Constructing Superpixels Using Layer Constraints. Conference on Computer Vision and Pattern Recognition, pages 1 - 8, 2008.
• [10] A. P. Moore, J. D. Prince, J. Warrell, U. Mohammed, G. Jones. Superpixel Lattices. Conference on Computer Vision and Pattern Recognition, pages 1 - 8, 2008.
• [11] P. Neubert, P. Protzel. Superpixel Benchmark and Comparison. Forum Bildverarbeitung, 2012.
• [12] P. Mehrani, O. Veksler, Y. Boykov. Superpixels and Supervoxels in an Energy Optimization Framework. European Conference on Computer Vision, pages 211 - 224, 2010.
• [13] J. Papon, A. Abramov, M. Schoeler, F. Wörgötter. Voxel Cloud Connectivity Segmentation - Supervoxels for Point Clouds. Conference on Computer Vision and Pattern Recognition, pages 2027 - 2034, 2013.
• [14] A. Schick, M. Fischer, R. Stiefelhagen. Measuring and Evaluating the Compactness of Superpixels. International Conference on Pattern Recognition, pages 930 - 934, 2012.
• [15] J. Shi, J. Malik. Normalized Cuts and Image Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, pages 888 - 905, 2000.
• [16] D. Weikersdorfer, D. Gossow, M. Beetz. Depth-Adaptive Superpixels. International Conference on Pattern Recognition, pages 2087 - 2090, 2012.
• [17] Y. Zhang, R. Hartley, J. Mashford, S. Burn. Superpixels via Pseudo-Boolean Optimization. International Conference on Computer Vision, pages 1387 - 1394, 2011.
• [18] R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, S. Süsstrunk. SLIC Superpixels Compared to State-Of-The-Art Superpixel Methods. IEEE Transactions on Pattern Analysis and Machine Intelligence, pages 2274 - 2282, 2012.
• [19] N. Silberman, D. Hoiem, P. Kohli, R. Fergus. Indoor Segmentation and Support Inference from RGBD Images. ECCV, 2012.