C. Harris, M. Stephens. A combined corner and edge detector. In Alvey Vision Conference, pages 1–6, Manchester, United Kingdom, September 1988.

The Harris detector (a more recent description can be found in [1]) is based on the second moment matrix $A$ of an image $x_n$. Its eigenvalues $\lambda_1, \lambda_2$ represent the signal change in two orthogonal directions, and interest points are extracted at pixels where both eigenvalues are large. To avoid computing the eigenvalues explicitly, Harris and Stephens propose to maximize

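$\lambda_1\lambda_2 - \kappa(\lambda_1 + \lambda_2)^2 = \text{det}(A) - \kappa \text{trace}(A)^2$ with $A = w \ast \begin{pmatrix}(\partial_x x_n)^2 & \partial_x x_n \partial_y x_n\\\partial_x x_n \partial_y x_n & (\partial_y x_n)^2\end{pmatrix}$

where $\kappa$ is a sensitivity parameter and $w$ is a smoothing window (for example a Gaussian) that averages the products of first derivatives over a local neighborhood. Without this averaging, $A$ would have rank one and its determinant would vanish everywhere. In practice, the detector is applied in scale space, that is, on a set of images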

$x_n^{(\sigma_s)} = g_{\sigma_s} \ast x_n$

where $\ast$ denotes convolution and $g_{\sigma_s}$ is a Gaussian kernel with standard deviation $\sigma_s$, with $\sigma_s$ sampled on a logarithmic scale. The scale at which an interest point is detected then serves as its characteristic scale and defines the size of the region used for local descriptors. Several extensions, for example the Harris-Laplace detector, have been proposed, see [1].
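
To make the response and its scale-space evaluation concrete, here is a minimal NumPy sketch. It is not the implementation of Harris and Stephens or of [1]. The function names, the Gaussian window `sigma_window`, the default `kappa=0.05`, the edge padding, and the fixed window width across scales are all assumptions made for illustration. Scale normalization of the derivatives, as used for example in Harris-Laplace, is left out.

```python
import numpy as np


def gaussian_kernel(sigma):
    """1D Gaussian kernel, truncated at 3 sigma and normalized to sum to 1."""
    radius = max(1, int(np.ceil(3 * sigma)))
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-(t ** 2) / (2 * sigma ** 2))
    return kernel / kernel.sum()


def smooth(image, sigma):
    """Separable Gaussian smoothing g_sigma * image (edge padding is an assumption)."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(image, radius, mode="edge")

    def conv(v):
        return np.convolve(v, kernel, mode="valid")

    rows = np.apply_along_axis(conv, 1, padded)   # filter along x
    return np.apply_along_axis(conv, 0, rows)     # filter along y


def harris_response(image, sigma_window=1.0, kappa=0.05):
    """det(A) - kappa * trace(A)^2 at every pixel."""
    # np.gradient returns the derivative along axis 0 (y) first, then axis 1 (x).
    dy, dx = np.gradient(image.astype(float))
    # Entries of the second moment matrix: windowed products of first derivatives.
    a_xx = smooth(dx * dx, sigma_window)
    a_xy = smooth(dx * dy, sigma_window)
    a_yy = smooth(dy * dy, sigma_window)
    det = a_xx * a_yy - a_xy ** 2
    trace = a_xx + a_yy
    return det - kappa * trace ** 2


def harris_scale_space(image, sigmas, sigma_window=1.0, kappa=0.05):
    """Stack of responses computed on the smoothed images g_sigma * image."""
    return np.stack(
        [harris_response(smooth(image, s), sigma_window, kappa) for s in sigmas]
    )


# Synthetic example: a bright square on a dark background.
image = np.zeros((32, 32))
image[8:24, 8:24] = 1.0

response = harris_response(image)
sigmas = np.geomspace(1.0, 4.0, num=3)  # logarithmically spaced scales
responses = harris_scale_space(image, sigmas)
```

On this synthetic image, the response is positive at the corners of the square, negative along its edges (only one eigenvalue is large there), and zero in flat regions. Interest points would then be taken as local maxima of the response above some threshold.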

  • [1] Krystian Mikolajczyk and Cordelia Schmid. Scale and affine invariant interest point detectors. International Journal of Computer Vision, 60(1):63–86, October 2004.