5th April 2016

READING

A. Gordo, J. A. Rodríguez-Serrano, F. Perronnin, E. Valveny. Leveraging category-level labels for instance-level image retrieval. In Conference on Computer Vision and Pattern Recognition (CVPR), pages 3045–3052, Providence, Rhode Island, June 2012.

Gordo et al. propose an approach to dimensionality reduction of image representations for image retrieval. Given a labeled training set of image representations, they jointly learn a projection matrix $P \in \mathbb{R}^{C' \times C}$ for dimensionality reduction and a set of classifiers $w_t \in \mathbb{R}^{C'}, t \in \{1,\ldots,T\}$, in order to project the image representations and their labels into a common subspace. A large-margin framework is employed; the energy

$E(P) = \sum_{(x_n, t_n, t) : t \neq t_n} \max\{0, 1 - s(x_n, t_n) + s(x_n, t)\}$ with $s(x_n, t) = (P x_n)^T w_t$

is minimized using stochastic gradient descent. Here, $C$ is the dimension of the image representations and $C'$ the target dimensionality. For $n = 1,\ldots,N$, $(x_n, t_n)$ denotes a training pair consisting of an image representation and its label. The score $s(x_n, t)$ can be interpreted as the relevance of label $t$ to image representation $x_n$. During minimization, a triple $(x_n, t_n, t)$ with $t_n \neq t$ is sampled, and the projection matrix $P$ as well as the classifiers $w_{t_n}$ and $w_t$ are updated only if the loss

$\max\{0, 1 - s(x_n,t_n) + s(x_n, t)\}$

is positive. Then, the update equations derived by Gordo et al. are as follows:

$P[\tau + 1] = P[\tau] + \gamma (w_{t_n}[\tau] - w_t[\tau])x_n^T$
$w_{t_n}[\tau + 1] = w_{t_n}[\tau] + \gamma P[\tau]x_n$
$w_t[\tau + 1] = w_t[\tau] - \gamma P[\tau] x_n$

where $\gamma$ is the learning rate and $\tau$ indexes the iterations. Finally, $P$ is used for dimensionality reduction while the classifiers $w_t$ are discarded.
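The update rules above are a gradient step on the hinge term of the sampled triple whenever that term is active. As a rough illustration, they can be sketched in NumPy as follows. This is not the authors' implementation: the function names `energy`, `sgd_step` and `train`, the random initialization, uniform sampling of triples, and the default values for `steps` and `gamma` are all assumptions made for this example. The classifiers are stored as rows of a matrix `W` of shape $T \times C'$.

```python
import numpy as np


def energy(P, W, X, labels):
    """Sum of hinge losses over all triples (x_n, t_n, t) with t != t_n."""
    idx = np.arange(len(X))
    scores = (X @ P.T) @ W.T                 # scores[n, t] = (P x_n)^T w_t
    true_scores = scores[idx, labels][:, None]
    margins = np.maximum(0.0, 1.0 - true_scores + scores)
    margins[idx, labels] = 0.0               # exclude the terms with t == t_n
    return margins.sum()


def sgd_step(P, W, x, t_n, t, gamma):
    """Apply one update for the triple (x, t_n, t); return True if updated."""
    z = P @ x                                # projected representation P x_n
    loss = max(0.0, 1.0 - z @ W[t_n] + z @ W[t])
    if loss <= 0.0:
        return False                         # margin satisfied, no update
    # All three updates use the values from iteration tau.
    diff = W[t_n] - W[t]
    P += gamma * np.outer(diff, x)
    W[t_n] += gamma * z
    W[t] -= gamma * z
    return True


def train(X, labels, num_labels, dim, steps=2000, gamma=0.01, seed=0):
    """Hypothetical training loop; initialization and sampling are assumptions."""
    rng = np.random.default_rng(seed)
    P = 0.1 * rng.standard_normal((dim, X.shape[1]))
    W = 0.1 * rng.standard_normal((num_labels, dim))
    for _ in range(steps):
        n = rng.integers(len(X))
        t_n = labels[n]
        # Draw a wrong label t != t_n uniformly at random.
        t = (t_n + rng.integers(1, num_labels)) % num_labels
        sgd_step(P, W, X[n], t_n, t, gamma)
    return P, W
```

After training, only `P` would be kept: the reduced representations are `X @ P.T`, and `W` is discarded, matching the description above.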

What is your opinion on the summarized work? Do you know of related work that is of interest? Let me know your thoughts in the comments below.