10th April 2017

READING

R. Pascanu, T. Mikolov, Y. Bengio. On the difficulty of training recurrent neural networks. ICML, 2013.

Pascanu et al. discuss the problems of exploding and vanishing gradients in recurrent neural networks. While these problems were first and foremost discussed in the context of recurrent neural networks and backpropagation through time, the presented solutions, e.g. gradient clipping, are applicable to general (convolutional) neural networks as well (e.g. [1], [2]). Assuming a first-order optimization method, they also give sufficient and necessary conditions for the two problems based on the eigenvalues of the involved weight matrices. Without discussing their regularization approach against vanishing gradients here, exploding gradients are mitigated using gradient clipping: whenever the norm of the gradient exceeds a pre-defined threshold, the gradient is rescaled to that threshold (a minimal sketch is given after the references). See the paper for details.

  • [1] I. J. Goodfellow, Y. Bengio, A. C. Courville. Deep Learning. MIT Press, 2016.
  • [2] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. E. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich. Going Deeper with Convolutions. CoRR, 2014.
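
To make the clipping step concrete, here is a minimal NumPy sketch of norm-based gradient clipping; the function name and the threshold value are placeholders of mine and not taken from the paper:

    import numpy as np

    def clip_gradient(gradient, threshold=5.0):
        # Norm-based clipping: if the gradient norm exceeds the threshold,
        # rescale the gradient so that its norm equals the threshold.
        # Threshold value and function name are placeholders, not from the paper.
        norm = np.linalg.norm(gradient)
        if norm > threshold:
            gradient = (threshold / norm) * gradient
        return gradient

In practice, this rescaling would be applied to the full gradient (or per parameter group) right before the parameter update of the first-order optimizer.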

What is your opinion on the summarized work? Or do you know related work that is of interest? Let me know your thoughts in the comments below.