IAM

OPENSOURCEFAN STUDYING
STUDYINGCOMPUTERSCIENCEANDMATH COMPUTERSCIENCE

Check out the latest superpixel benchmark — Superpixel Benchmark (2016) — and let me know your opinion! @david_stutz
20thJUNE2015

READING

D. Arthur, S. Vassilvitskii. k-means++: The Advantages of Careful Seeding. Proceedings of the ACM-SIAM Symposium on Discrete Algorithms, 2007.

Arthur and Vassilvitskii propose a novel initialization for standard $K$-means. The initialization step is summarized as follows: Given a set of data points $X \subset \mathbb{R}^D$ and a number of clusters $K$:

  1. Randomly choose the first center $\mu_1 \in X$.
  2. For $k = 2,\ldots,K$, choose center $\mu_k$ as $\mu_k = x_n \in X$ with probability proportional to:

$\frac{d(x_n)}{\sum_{x_{n'}} d(x_{n'})}$ with $d(x_n) = \min_{1 \leq k' < k} \{\|x_n - \mu_{k'}\|_2\}$

This type of initialization is commonly implemented in standard $K$-means implementations, for example in OpenCV.

What is your opinion on the summarized work? Or do you know related work that is of interest? Let me know your thoughts in the comments below or using the following platforms: