(Convolutional) Neural Network Tutorials

This page presents seminar reports intended as tutorials for neural networks, starting with multi-layer perceptrons and continuing with convolutional neural networks and their use for image retrieval. These papers were written as part of seminars organized by the chair of computer science i6 and the computer vision group at RWTH Aachen University.

Introduction to Neural Networks

This report gives a succinct but detailed introduction to neural networks, covering multi-layer perceptrons, backpropagation, and optimization techniques, with a demonstration on the MNIST dataset.

Abstract: In this seminar paper we study artificial neural networks, their training and application to pattern recognition. We start by giving a general definition of artificial neural networks and introduce both the single-layer and the multilayer perceptron. After considering several activation functions we discuss network topology and the expressive power of multilayer perceptrons. The second section introduces supervised network training. To this end, we discuss gradient descent and Newton's method for parameter optimization. We derive the error backpropagation algorithm for evaluating the gradient of the error function and extend this approach to evaluate its Hessian. In addition, the concept of regularization will be introduced. The third section introduces pattern classification. Using maximum likelihood estimation we derive the cross-entropy error function. As an application, we train a two-layer perceptron to recognize handwritten digits based on the MNIST dataset.
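To make the ingredients of the abstract concrete, here is a minimal sketch of a two-layer perceptron trained by gradient descent on the cross-entropy error. It uses a toy two-class problem in place of MNIST, and all layer sizes, learning rate, and variable names are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Toy two-class problem standing in for MNIST: points around two centers.
X = np.vstack([rng.normal(-1, 0.5, (50, 2)), rng.normal(1, 0.5, (50, 2))])
T = np.zeros((100, 2)); T[:50, 0] = 1; T[50:, 1] = 1  # one-hot targets

W1 = rng.normal(0, 0.5, (2, 8)); b1 = np.zeros(8)  # hidden layer parameters
W2 = rng.normal(0, 0.5, (8, 2)); b2 = np.zeros(2)  # output layer parameters
lr = 0.5

for epoch in range(200):
    # forward pass
    H = np.tanh(X @ W1 + b1)      # hidden activations
    Y = softmax(H @ W2 + b2)      # class posteriors
    # error backpropagation (softmax output + cross-entropy error)
    dZ2 = (Y - T) / len(X)        # gradient at the output pre-activations
    dW2 = H.T @ dZ2; db2 = dZ2.sum(0)
    dH = dZ2 @ W2.T * (1 - H**2)  # backprop through tanh
    dW1 = X.T @ dH; db1 = dH.sum(0)
    # gradient descent step
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

accuracy = (Y.argmax(1) == T.argmax(1)).mean()
```

The key simplification exploited here is that for a softmax output with cross-entropy error, the gradient at the pre-activations reduces to the difference between predictions and targets, which is exactly the starting point of the backpropagation derivation in the paper.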

Update. In addition to the seminar paper, MatLab code for a simple two-layer perceptron can be found on GitHub. A short introduction can also be found in my article "Recognizing Handwritten Digits using a Two-Layer Perceptron and the MNIST Dataset".

Find a table of contents in this article.

Seminar Paper | Presentation Slides

Understanding Convolutional Neural Networks

Building on a general introduction to neural networks and their training, this report gives an introduction to convolutional neural networks:

Abstract: This seminar paper focuses on convolutional neural networks and a visualization technique allowing further insights into their internal operation. After giving a brief introduction to neural networks and the multilayer perceptron, we review both supervised and unsupervised training of neural networks in detail. In addition, we discuss several approaches to regularization. The second section introduces the different types of layers present in recent convolutional neural networks. Based on these basic building blocks, we discuss the architecture of the traditional convolutional neural network (LeNet) as well as the architecture of recent implementations. The third section focuses on a technique to visualize feature activations of higher layers by backprojecting them to the image plane. This allows deeper insights into the internal workings of convolutional neural networks, so that recent architectures can be evaluated and improved further.
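The basic building blocks mentioned in the abstract can be sketched in a few lines of NumPy: a single-channel convolution, a ReLU non-linearity, and 2x2 max pooling. The input size and the edge filter below are hypothetical examples; real networks stack many such layers with learned multi-channel filters.

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' 2D cross-correlation, the convolution used in CNN layers."""
    kh, kw = kernel.shape
    h = image.shape[0] - kh + 1
    w = image.shape[1] - kw + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = (image[i:i+kh, j:j+kw] * kernel).sum()
    return out

def relu(x):
    """Rectification layer: element-wise max(x, 0)."""
    return np.maximum(x, 0)

def max_pool(x, size=2):
    """Non-overlapping max pooling over size x size windows."""
    h, w = x.shape[0] // size, x.shape[1] // size
    x = x[:h*size, :w*size].reshape(h, size, w, size)
    return x.max(axis=(1, 3))

image = np.random.default_rng(1).random((8, 8))
edge_filter = np.array([[1., -1.], [1., -1.]])  # hypothetical vertical-edge filter

# 8x8 input -> 7x7 feature map (valid convolution) -> 3x3 after 2x2 pooling
features = max_pool(relu(conv2d(image, edge_filter)))
```

Chaining these operations, as in the last line, is exactly the conv-nonlinearity-pool pattern that LeNet-style architectures repeat several times before the fully connected layers.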

Find a table of contents in this article.

Seminar Paper | Presentation Slides

Convolutional Neural Networks for Image Retrieval

Finally, as an advanced topic, this report discusses the use of convolutional neural network architectures such as AlexNet for image retrieval.

Abstract: This seminar report focuses on using convolutional neural networks for image retrieval. Firstly, we give a thorough discussion of several state-of-the-art techniques in image retrieval by considering the associated subproblems: image description, descriptor compression, nearest-neighbor search and query expansion. We discuss both the aggregation of local descriptors using clustering and metric learning techniques as well as global descriptors. Subsequently, we briefly introduce the basic concepts of deep convolutional neural networks, focusing on the AlexNet architecture. We discuss different types of layers commonly used in recent architectures, for example convolutional layers, non-linearity and rectification layers, pooling layers as well as local contrast normalization layers. Then, we briefly review supervised training techniques based on stochastic gradient descent and regularization techniques such as dropout and weight decay. Finally, we discuss the use of feature activations in intermediate layers as image representation for image retrieval. After presenting experiments and comparing convolutional neural networks for image retrieval with other state-of-the-art techniques, we conclude by motivating the combined use of deep architectures and hand-crafted image representations for accurate and efficient image retrieval.
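The retrieval idea described in the abstract can be sketched as follows: each image is represented by a feature vector (standing in for the intermediate-layer activations of a network such as AlexNet), vectors are L2-normalized, and database images are ranked by cosine similarity to the query. The random vectors and the descriptor dimension below are placeholders for real descriptors, not part of the report.

```python
import numpy as np

rng = np.random.default_rng(2)

# 100 database images, each described by a 128-d feature vector (placeholder
# for intermediate-layer activations); L2-normalize each descriptor.
database = rng.normal(size=(100, 128))
database /= np.linalg.norm(database, axis=1, keepdims=True)

# Query: a slightly perturbed copy of database image 42 (a near-duplicate).
query = database[42] + 0.01 * rng.normal(size=128)
query /= np.linalg.norm(query)

# For unit vectors, the dot product equals cosine similarity.
similarities = database @ query
ranking = np.argsort(-similarities)  # most similar images first
top1 = ranking[0]
```

In practice the exhaustive dot product above is replaced by approximate nearest-neighbor search over compressed descriptors, which is exactly the descriptor compression and search subproblem the report discusses.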

Find a table of contents in this article.

Seminar Paper | Presentation Slides