NAG
2019/12/05
-----
// Overview of different Optimizers for neural networks
-----
// An Overview on Optimization Algorithms in Deep Learning 1 - Taihong Xiao
-----
# RMSProp
-----
// Comparison of Optimizers in Neural Networks - Fishpond
-----
References
# NAG
Nesterov, Y. "A method of solving a convex programming problem with convergence rate $$ O(\frac{1}{k^2}) $$." Soviet Math. Dokl. Vol. 27.
http://mpawankumar.info/teaching/cdt-big-data/nesterov83.pdf
# RMSProp
Tieleman, Tijmen, and Geoffrey Hinton. "Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude." COURSERA: Neural networks for machine learning 4.2 (2012): 26-31.
http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf
-----
Overview of different Optimizers for neural networks
https://medium.com/datadriveninvestor/overview-of-different-optimizers-for-neural-networks-e0ed119440c3
An Overview on Optimization Algorithms in Deep Learning 1 - Taihong Xiao
https://prinsphield.github.io/posts/2016/02/overview_opt_alg_deep_learning1/
Comparison of Optimizers in Neural Networks - Fishpond
https://tiddler.github.io/optimizers/
-----
Faster than Momentum: Unveiling the True Face of Nesterov Accelerated Gradient - Zhihu
https://zhuanlan.zhihu.com/p/22810533