O. Chum, J. Philbin, J. Sivic, M. Isard, A. Zisserman. Total recall: Automatic query expansion with a generative feature model for object retrieval. In Computer Vision, International Conference on, pages 1–8, Rio de Janeiro, Brazil, October 2007.

Chum et al. bring query expansion, known from text retrieval, to the visual domain. The idea of query expansion is to improve query results by re-issuing a number of highly ranked results as a new query in order to retrieve further relevant results. Average query expansion takes the top $K^\ast$ of the $K$ retrieved images and averages their term-frequency representations together with that of the query (Chum et al. use the Bag of Visual Words model with term-frequency weighting [1]):

$z_\text{avg} = \frac{1}{K^\ast + 1} \left(z_0 + \sum_{k = 1}^{K^\ast} z_k\right)$

where $z_0$ is the term-frequency representation of the query image and $z_1, \ldots, z_K$ are those of the retrieved images, ordered by rank. The results of the query $z_\text{avg}$ are appended to the first $K^\ast$ results of the original query. Usually, this type of query expansion is combined with spatial verification, such that only spatially verified results contribute to the averaged query.
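The averaging and re-querying steps can be illustrated with a minimal Python sketch. This is not the implementation of Chum et al.: the function names `average_query` and `expand`, the `search` callable and the `vectors` lookup are hypothetical and introduced only for illustration. The sketch assumes that `search` returns image ids ordered by rank, that the retrieved results have already been spatially verified, and that results already contained in the first $K^\ast$ are skipped when appending (a choice made here, not stated above).

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def average_query(z0: Sequence[float],
                  retrieved: Sequence[Sequence[float]],
                  k_star: int) -> Vector:
    """Average the query vector z0 with the top k_star retrieved vectors."""
    top = retrieved[:k_star]
    total = list(z0)
    for z_k in top:
        for i, value in enumerate(z_k):
            total[i] += value
    # Divide by the number of summed vectors; this equals k_star + 1
    # whenever at least k_star results are available.
    return [value / (len(top) + 1) for value in total]


def expand(search: Callable[[Sequence[float]], List[str]],
           z0: Sequence[float],
           vectors: Dict[str, Vector],
           k_star: int) -> List[str]:
    """Re-issue the averaged query and append its results.

    `search` (hypothetical) maps a query vector to a ranked list of image ids;
    `vectors` (hypothetical) maps an image id to its term-frequency vector.
    """
    first = search(z0)
    top_ids = first[:k_star]
    z_avg = average_query(z0, [vectors[i] for i in top_ids], k_star)
    second = search(z_avg)
    # Keep the first k_star original results, then append the results of the
    # expanded query, skipping ids that are already present (an assumption).
    return top_ids + [i for i in second if i not in top_ids]
```

For example, `average_query([3.0, 0.0], [[0.0, 3.0], [3.0, 3.0]], 2)` sums the three vectors to `[6.0, 6.0]` and divides by $K^\ast + 1 = 3$, giving `[2.0, 2.0]`.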

  • [1] J. Sivic, A. Zisserman. Video Google: A text retrieval approach to object matching in videos. In Computer Vision, International Conference on, pages 1470–1477, Nice, France, October 2003.