Thursday, December 05, 2019

NAG

-----


// Overview of different Optimizers for neural networks
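Since the post is titled NAG, here is a minimal NumPy sketch of one Nesterov Accelerated Gradient step of the kind the overview above walks through. The function name `nag_step` and the default `lr`/`momentum` values are illustrative choices, not taken from the linked articles.

```python
import numpy as np

def nag_step(theta, velocity, grad_fn, lr=0.01, momentum=0.9):
    # Evaluate the gradient at the "look-ahead" point theta + momentum * velocity
    # rather than at theta itself; this look-ahead is what distinguishes NAG
    # from classical momentum.
    grad = grad_fn(theta + momentum * velocity)
    velocity = momentum * velocity - lr * grad
    theta = theta + velocity
    return theta, velocity

# Usage: minimize f(x) = x^2 starting from x = 5.
theta, velocity = np.array([5.0]), np.zeros(1)
for _ in range(100):
    theta, velocity = nag_step(theta, velocity, lambda x: 2 * x)
print(theta)  # close to 0
```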

-----


// An Overview on Optimization Algorithms in Deep Learning 1 - Taihong Xiao

-----


# RMSProp
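A minimal sketch of the RMSProp update described in the Tieleman & Hinton lecture slides cited below: keep a running average of the squared gradient and divide the current gradient by its square root. The function name and the default hyperparameters here are illustrative choices, not prescribed values.

```python
import numpy as np

def rmsprop_step(theta, mean_sq, grad, lr=0.001, decay=0.9, eps=1e-8):
    # Running average of the squared gradient (its "recent magnitude").
    mean_sq = decay * mean_sq + (1 - decay) * grad ** 2
    # Divide the gradient element-wise by the root of that average.
    theta = theta - lr * grad / (np.sqrt(mean_sq) + eps)
    return theta, mean_sq

# Usage: minimize f(x) = x^2 starting from x = 5.
theta, mean_sq = np.array([5.0]), np.zeros(1)
for _ in range(1000):
    theta, mean_sq = rmsprop_step(theta, mean_sq, 2 * theta, lr=0.01)
print(theta)  # approaches 0, with a small residual oscillation of order lr
```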

-----


// Comparison of Optimizers in Neural Networks - Fishpond

-----

References

# NAG
Nesterov, Y. "A method of solving a convex programming problem with convergence rate $O(1/k^2)$." Soviet Math. Dokl. Vol. 27 (1983).
http://mpawankumar.info/teaching/cdt-big-data/nesterov83.pdf 

# RMSProp
Tieleman, Tijmen, and Geoffrey Hinton. "Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude." COURSERA: Neural networks for machine learning 4.2 (2012): 26-31.
http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf 

-----

Overview of different Optimizers for neural networks
https://medium.com/datadriveninvestor/overview-of-different-optimizers-for-neural-networks-e0ed119440c3

An Overview on Optimization Algorithms in Deep Learning 1 - Taihong Xiao
https://prinsphield.github.io/posts/2016/02/overview_opt_alg_deep_learning1/

Comparison of Optimizers in Neural Networks - Fishpond
https://tiddler.github.io/optimizers/

-----

Faster than Momentum: Unveiling the True Face of Nesterov Accelerated Gradient - Zhihu
https://zhuanlan.zhihu.com/p/22810533
