Thursday, December 05, 2019

SGD

2019/12/02

-----


// Overview of different Optimizers for neural networks
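
Only the article title survives here. As a minimal sketch of the kind of
update rules such an overview compares (NumPy; function names and
hyperparameter defaults are illustrative, not taken from the article):

import numpy as np

def sgd(theta, grad, lr=0.01):
    # Vanilla SGD: step straight against the gradient.
    return theta - lr * grad

def sgd_momentum(theta, grad, v, lr=0.01, beta=0.9):
    # Momentum: an exponentially decayed velocity smooths the path.
    v = beta * v + grad
    return theta - lr * v, v

def rmsprop(theta, grad, s, lr=0.001, rho=0.9, eps=1e-8):
    # RMSProp: divide by a running RMS of past gradients so that
    # consistently steep directions take smaller steps.
    s = rho * s + (1 - rho) * grad ** 2
    return theta - lr * grad / (np.sqrt(s) + eps), s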

-----


// An Overview on Optimization Algorithms in Deep Learning 1 - Taihong Xiao
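
Again only the title remains. One rule worth sketching alongside it is
Nesterov accelerated gradient (NAG), since Nadam below builds on it. A
minimal sketch, with grad_fn standing in for any gradient function
(hypothetical name, illustrative defaults):

def nag_step(theta, grad_fn, v, lr=0.01, mu=0.9):
    # NAG: evaluate the gradient at the look-ahead point
    # theta + mu * v rather than at theta itself.
    v = mu * v - lr * grad_fn(theta + mu * v)
    return theta + v, v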

-----


# Nadam
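
No notes were kept under this heading. As a sketch of the simplified
Nadam update from the Dozat paper cited below (the paper's momentum
schedule is dropped; defaults are illustrative):

import numpy as np

def nadam_step(theta, grad, m, v, t, lr=0.002,
               beta1=0.9, beta2=0.999, eps=1e-8):
    # t is the 1-based step count.
    # Adam-style first and second moment estimates.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)          # bias-corrected momentum
    v_hat = v / (1 - beta2 ** t)          # bias-corrected variance
    # Nesterov twist: fold the current gradient back into the
    # bias-corrected momentum before stepping.
    m_bar = beta1 * m_hat + (1 - beta1) * grad / (1 - beta1 ** t)
    return theta - lr * m_bar / (np.sqrt(v_hat) + eps), m, v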

-----


// Stochastic Gradient Descent - Deep Learning#g - Medium
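
The linked piece covers plain stochastic gradient descent; a
self-contained toy run as a sketch (NumPy, synthetic linear-regression
data, all values illustrative):

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))            # toy features
y = X @ np.array([1.5, -2.0, 0.5])       # toy linear targets

theta = np.zeros(3)
lr, batch_size = 0.1, 32

for epoch in range(20):
    idx = rng.permutation(len(X))        # reshuffle every epoch
    for start in range(0, len(X), batch_size):
        b = idx[start:start + batch_size]
        err = X[b] @ theta - y[b]
        grad = X[b].T @ err / len(b)     # minibatch MSE gradient
        theta -= lr * grad               # stochastic gradient step

After a few epochs theta approaches the generating weights
[1.5, -2.0, 0.5], which is the whole point of the toy setup.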

-----

References

# SGD
Bottou, Léon. "Stochastic gradient descent tricks." Neural Networks: Tricks of the Trade. Springer, Berlin, Heidelberg, 2012. 421–436.
https://www.microsoft.com/en-us/research/wp-content/uploads/2012/01/tricks-2012.pdf 

# Nadam
Dozat, Timothy. "Incorporating Nesterov Momentum into Adam." (2016).
https://openreview.net/pdf?id=OM0jvwB8jIp57ZJjtNEZ 

-----

Overview of different Optimizers for neural networks
https://medium.com/datadriveninvestor/overview-of-different-optimizers-for-neural-networks-e0ed119440c3

An Overview on Optimization Algorithms in Deep Learning 1 - Taihong Xiao
https://prinsphield.github.io/posts/2016/02/overview_opt_alg_deep_learning1/

Stochastic Gradient Descent - Deep Learning#g - Medium
https://medium.com/deep-learning-g/stochastic-gradient-descent-63a155ba3975

-----

SGD(1) — for non-convex functions – Ang's learning notes
https://angnotes.wordpress.com/2018/08/19/sgd1-for-non-convex-functions/
