A. G. Howard. Some Improvements on Deep Convolutional Neural Network Based Image Classification. CoRR, 2013.

Howard discusses several approaches to improving the performance of deep networks on the ImageNet dataset, building on the model of Krizhevsky et al. []. These approaches mostly concern data augmentation during training and ensemble prediction at test time:

  • Data augmentation: In addition to the random crops, horizontal flipping and random lighting changes employed in [], Howard uses brightness, contrast and color manipulations (unfortunately, the details have been omitted). Furthermore, instead of cropping images to the training size, Howard only re-scales the smallest side and then selects random crops within the rescaled image.
  • Testing: Howard averages the predictions over several transformed versions of each input. To this end, 90 transformations are considered, including crops, translations and scales. A greedy algorithm then selects the subset of these 90 transformations that yields the best results.
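
To make the two ideas above more concrete, here is a minimal sketch in plain Python. It is not Howard's implementation. The sizes 256 and 224 are illustrative defaults, all function names are made up for this example, and the forward-selection rule (keep adding the transformation that improves accuracy most, stop when nothing improves) is only one plausible reading of "greedy", since the summary above does not spell out the procedure.

```python
import random


def rescale_short_side(width, height, short_side=256):
    """Image size after scaling the smaller side to `short_side` (assumed default)."""
    scale = short_side / min(width, height)
    return round(width * scale), round(height * scale)


def random_crop_box(width, height, crop=224, rng=random):
    """Random crop window (left, top, right, bottom) inside the rescaled image."""
    left = rng.randint(0, width - crop)  # randint includes both end points
    top = rng.randint(0, height - crop)
    return left, top, left + crop, top + crop


def accuracy(view_ids, probs, labels):
    """Top-1 accuracy after averaging the predictions of the selected views.

    probs[v][n][c] is the probability of class c for sample n under view v.
    """
    correct = 0
    for n, label in enumerate(labels):
        num_classes = len(probs[view_ids[0]][n])
        mean = [sum(probs[v][n][c] for v in view_ids) / len(view_ids)
                for c in range(num_classes)]
        if mean.index(max(mean)) == label:
            correct += 1
    return correct / len(labels)


def greedy_select(probs, labels):
    """Forward selection (an assumption): add the most helpful view until no gain."""
    selected, best = [], -1.0
    remaining = set(range(len(probs)))
    while remaining:
        candidates = [(accuracy(selected + [v], probs, labels), v)
                      for v in sorted(remaining)]
        # max() keeps the first maximum, so ties go to the lowest view index.
        score, view = max(candidates, key=lambda c: c[0])
        if score <= best:
            break  # no remaining view improves the averaged prediction
        selected.append(view)
        remaining.remove(view)
        best = score
    return selected, best
```

In practice, `probs` would hold the network's predictions on a held-out set for each of the 90 transformations, and the returned indices would be the transformations to evaluate at test time.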

[] A. Krizhevsky, I. Sutskever, G. Hinton. ImageNet Classification with Deep Convolutional Neural Networks. Advances in Neural Information Processing Systems 25, 2012.