Y. Ganin, V. S. Lempitsky. N^4-Fields: Neural Network Nearest Neighbor Fields for Image Transforms. Computing Research Repository, 2014.

Ganin and Lempitsky propose combining convolutional neural networks with $K$-nearest-neighbor search for edge detection. Their implementation builds on the implementation and architecture of Krizhevsky et al. [1]; however, the code is not publicly available. Similar to [2], the convolutional neural network is trained on patches of fixed size. Because the desired output annotation may be high-dimensional, principal component analysis (see e.g. [3]) is used to reduce the dimensionality of the targets, and the network is trained to predict these reduced targets. Finally, $1$-nearest-neighbor search on a subset of the training set is used to annotate new test samples. In practice, Ganin and Lempitsky use a committee of these so-called $N^4$-fields (convolutional neural network + $1$-nearest-neighbor) and average their predictions.
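The steps around the network (reducing the targets with PCA, looking up the nearest neighbor, averaging a committee) can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names, patch size, number of components and toy data are all assumptions made for illustration, and the convolutional neural network is replaced by a noisy stand-in that returns the true low-dimensional codes plus small noise.

```python
import numpy as np


def fit_pca(targets, n_components):
    """Fit PCA on flattened annotation patches (one patch per row)."""
    mean = targets.mean(axis=0)
    # Rows of vt are the principal directions, sorted by decreasing variance.
    _, _, vt = np.linalg.svd(targets - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(targets, mean, components):
    """Map annotation patches to low-dimensional codes (the regression targets)."""
    return (targets - mean) @ components.T


def nearest_neighbor_annotate(query_codes, dict_codes, dict_annotations):
    """Transfer the annotation of the closest dictionary code to each query."""
    diff = query_codes[:, None, :] - dict_codes[None, :, :]
    nearest = (diff ** 2).sum(axis=2).argmin(axis=1)
    return dict_annotations[nearest]


def committee_average(predictions):
    """Average the annotation patches predicted by several committee members."""
    return np.mean(np.stack(predictions), axis=0)


rng = np.random.default_rng(0)

# Toy data (assumption): 200 binary edge patches of size 16x16, flattened.
annotations = (rng.random((200, 256)) > 0.9).astype(float)
mean, components = fit_pca(annotations, n_components=16)
codes = project(annotations, mean, components)

# Hypothetical stand-in for a trained network: true codes plus small noise.
# Each committee member gets its own noise realisation.
members = [
    nearest_neighbor_annotate(
        codes[:5] + 0.01 * rng.standard_normal((5, 16)), codes, annotations
    )
    for _ in range(3)
]
edge_maps = committee_average(members)  # shape (5, 256), values in [0, 1]
```

In this sketch the dictionary for the nearest-neighbor lookup is the full toy training set; the summary above only states that a subset of the training set is used, so how that subset is chosen is left open here.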

  • [1] A. Krizhevsky, I. Sutskever, G. E. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. Advances in Neural Information Processing Systems, 25, pages 1097–1105, 2012.
  • [2] P. Dollár, C. Zitnick. Structured Forests for Fast Edge Detection. International Conference on Computer Vision, 2013.
  • [3] C. Bishop. Pattern Recognition and Machine Learning. Springer Verlag, New York, 2006.
