Monday, October 19, 2020

Optimization

2020/10/12

-----


https://pixabay.com/zh/photos/stopwatch-gears-work-working-time-3699314/

-----


https://www.neuraldesigner.com/blog/5_algorithms_to_train_a_neural_network

-----



https://blog.slinuxer.com/2016/09/sgd-comparison


Fig. Optimization algorithm comparison.

-----


http://www.stat.cmu.edu/~ryantibs/convexopt-F18/lectures/quasi-newton.pdf

-----


https://en.wikipedia.org/wiki/Quasi-Newton_method

-----


https://zh.wikipedia.org/wiki/%E8%8E%B1%E6%96%87%E8%B4%9D%E6%A0%BC%EF%BC%8D%E9%A9%AC%E5%A4%B8%E7%89%B9%E6%96%B9%E6%B3%95

-----

References

◎ Overall framework

5 algorithms to train a neural network

https://www.neuraldesigner.com/blog/5_algorithms_to_train_a_neural_network


◎ 1. SGD

SGD algorithm comparison – Slinuxer

https://blog.slinuxer.com/2016/09/sgd-comparison


An overview of gradient descent optimization algorithms

https://ruder.io/optimizing-gradient-descent/


From SGD to Adam: an overview of deep learning optimization algorithms (Part 1) - Zhihu

https://zhuanlan.zhihu.com/p/32626442 
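All of the SGD variants surveyed in these posts build on the same basic update rule. As a quick companion, here is a minimal NumPy sketch of SGD with heavy-ball momentum; the function name and the toy quadratic are illustrative, not taken from the links above, and the gradient here is exact rather than a mini-batch estimate.

```python
import numpy as np

def sgd_momentum(grad, x0, lr=0.1, beta=0.9, steps=100):
    """Heavy-ball momentum: v <- beta * v + grad(x); x <- x - lr * v."""
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(steps):
        v = beta * v + grad(x)   # decaying accumulation of past gradients
        x = x - lr * v
    return x

# Toy example: minimize f(x, y) = x^2 + 10*y^2, whose gradient is [2x, 20y].
print(sgd_momentum(lambda p: np.array([2 * p[0], 20 * p[1]]), [3.0, 2.0]))
```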


◎ 2. Newton's method and the Gauss-Newton method

Gauss-Newton algorithm for solving nonlinear least squares explained - YouTube

https://www.youtube.com/watch?v=CjrRFbQwKLA

4.3 Newton's Method

https://jermwatt.github.io/machine_learning_refined/notes/4_Second_order_methods/4_4_Newtons.html

Hessian Matrix vs. Gauss-Newton Hessian Matrix | Semantic Scholar

https://www.semanticscholar.org/paper/Hessian-Matrix-vs.-Gauss-Newton-Hessian-Matrix-Chen/a8921166af9d21cdb8886ddb9a80c703abe3dde5

Newton's method and the Gauss-Newton method | Cheng Wei's Blog

https://scm_mos.gitlab.io/algorithm/newton-and-gauss-newton/
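For the nonlinear least-squares setting these references cover, Gauss-Newton replaces the Hessian of 0.5 * ||r||^2 with J^T J, where J is the Jacobian of the residuals r. A minimal sketch under that assumption; the exponential-fitting example and all names are illustrative, not from the references.

```python
import numpy as np

def gauss_newton(residual, jacobian, theta0, steps=20):
    """Gauss-Newton: theta <- theta - (J^T J)^{-1} J^T r."""
    theta = np.asarray(theta0, dtype=float)
    for _ in range(steps):
        r, J = residual(theta), jacobian(theta)
        theta = theta - np.linalg.solve(J.T @ J, J.T @ r)
    return theta

# Toy example: fit y = a * exp(b * t); parameters are (a, b).
t = np.linspace(0, 1, 50)
y = 2.0 * np.exp(1.5 * t)
res = lambda p: p[0] * np.exp(p[1] * t) - y
jac = lambda p: np.stack([np.exp(p[1] * t), p[0] * t * np.exp(p[1] * t)], axis=1)
print(gauss_newton(res, jac, [1.0, 1.0]))  # approaches [2.0, 1.5]
```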

◎ 3. Conjugate gradient method

Deep Learning Book

https://www.deeplearningbook.org/contents/optimization.html

Blog - Conjugate Gradient 1 | Pattarawat Chormai

https://pat.chormai.org/blog/2020-conjugate-gradient-1

linear algebra - Why is the conjugate direction better than the negative of gradient, when minimizing a function - Mathematics Stack Exchange

https://math.stackexchange.com/questions/1020008/why-is-the-conjugate-direction-better-than-the-negative-of-gradient-when-minimi
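A minimal sketch of linear conjugate gradient, the case the Stack Exchange thread above discusses: minimizing the quadratic 0.5 * x^T A x - b^T x (equivalently solving Ax = b) along directions that are A-conjugate to one another. Variable names are illustrative.

```python
import numpy as np

def conjugate_gradient(A, b, x0=None, tol=1e-10):
    """Linear CG for symmetric positive definite A; converges in at most n steps."""
    x = np.zeros_like(b) if x0 is None else np.asarray(x0, dtype=float)
    r = b - A @ x                    # residual = negative gradient
    d = r.copy()                     # first direction is steepest descent
    while r @ r > tol:
        alpha = (r @ r) / (d @ A @ d)        # exact line search along d
        x = x + alpha * d
        r_new = r - alpha * (A @ d)
        d = r_new + ((r_new @ r_new) / (r @ r)) * d  # keep d A-conjugate
        r = r_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b))      # matches np.linalg.solve(A, b)
```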

◎ 4. Quasi-Newton methods

Quasi-Newton methods (CMU Convex Optimization, Fall 2018 lecture notes)

http://www.stat.cmu.edu/~ryantibs/convexopt-F18/lectures/quasi-newton.pdf

Quasi-Newton method - Wikipedia

https://en.wikipedia.org/wiki/Quasi-Newton_method

# A very well-structured overview

Gradient descent, Newton's method, and quasi-Newton methods - Zhihu

https://zhuanlan.zhihu.com/p/37524275
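A minimal BFGS sketch using the standard inverse-Hessian secant update that the lecture notes above derive; the Armijo backtracking line search and the toy quadratic are simplifications of mine, not from the references.

```python
import numpy as np

def bfgs(f, grad, x0, steps=50):
    """BFGS: build an inverse-Hessian estimate H from gradient differences."""
    x = np.asarray(x0, dtype=float)
    n = x.size
    H = np.eye(n)                       # initial guess: identity (gradient descent)
    g = grad(x)
    for _ in range(steps):
        d = -H @ g                      # quasi-Newton search direction
        t = 1.0                         # Armijo backtracking line search
        while f(x + t * d) > f(x) + 1e-4 * t * (g @ d):
            t *= 0.5
        s = t * d
        x_new = x + s
        g_new = grad(x_new)
        y = g_new - g
        rho = 1.0 / (y @ s)
        I = np.eye(n)
        # Secant update: the new H maps y to s exactly (H_new @ y == s).
        H = (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
            + rho * np.outer(s, s)
        x, g = x_new, g_new
        if np.linalg.norm(g) < 1e-8:
            break
    return x

f = lambda p: p[0] ** 2 + 10 * p[1] ** 2
grad = lambda p: np.array([2 * p[0], 20 * p[1]])
print(bfgs(f, grad, [3.0, 2.0]))        # approaches [0, 0]
```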

◎ 5. Levenberg-Marquardt method

Optimization for Least Square Problems

https://zlthinker.github.io/optimization-for-least-square-problem

Levenberg-Marquardt method - Wikipedia (Chinese)

https://zh.wikipedia.org/wiki/%E8%8E%B1%E6%96%87%E8%B4%9D%E6%A0%BC%EF%BC%8D%E9%A9%AC%E5%A4%B8%E7%89%B9%E6%96%B9%E6%B3%95
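A minimal Levenberg-Marquardt sketch with the classic damping schedule (shrink lambda after a successful step, grow it after a failed one), which interpolates between Gauss-Newton and small-step gradient descent. It reuses the illustrative exponential fit from the Gauss-Newton sketch above; all names are mine.

```python
import numpy as np

def levenberg_marquardt(residual, jacobian, theta0, lam=1e-2, steps=50):
    """LM: damped Gauss-Newton. Solve (J^T J + lam*I) delta = -J^T r."""
    theta = np.asarray(theta0, dtype=float)
    cost = lambda p: 0.5 * np.sum(residual(p) ** 2)
    for _ in range(steps):
        r, J = residual(theta), jacobian(theta)
        delta = np.linalg.solve(J.T @ J + lam * np.eye(theta.size), -J.T @ r)
        if cost(theta + delta) < cost(theta):
            theta, lam = theta + delta, lam * 0.5   # accept step, trust GN more
        else:
            lam *= 2.0                              # reject step, damp harder
    return theta

# Same toy fit as the Gauss-Newton sketch: y = a * exp(b * t).
t = np.linspace(0, 1, 50)
y = 2.0 * np.exp(1.5 * t)
res = lambda p: p[0] * np.exp(p[1] * t) - y
jac = lambda p: np.stack([np.exp(p[1] * t), p[0] * t * np.exp(p[1] * t)], axis=1)
print(levenberg_marquardt(res, jac, [1.0, 1.0]))    # approaches [2.0, 1.5]
```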

◎ 6. Natural gradient method
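For reference, the natural gradient update preconditions the ordinary gradient with the inverse Fisher information matrix F: theta <- theta - eta * F(theta)^(-1) * grad L(theta). Steps are then measured in the KL geometry of the model's predictive distribution rather than in raw parameter coordinates, which is what K-FAC (next item) approximates cheaply with Kronecker-factored blocks.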


◎ 7. K-FAC


◎ 8. Shampoo

-----
