Wednesday, September 16, 2020

Weight Decay

2019/12/09

-----


http://speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2016/Lecture/DNN%20tip.pdf

-----

References

# Weight Decay
Zhang, Guodong, et al. "Three mechanisms of weight decay regularization." arXiv preprint arXiv:1810.12281 (2018).
https://arxiv.org/pdf/1810.12281.pdf

// WD 1989
Hanson, Stephen José, and Lorien Y. Pratt. "Comparing biases for minimal network construction with back-propagation." Advances in neural information processing systems. 1989.
http://papers.nips.cc/paper/156-comparing-biases-for-minimal-network-construction-with-back-propagation.pdf

// WD 1992
Krogh, Anders, and John A. Hertz. "A simple weight decay can improve generalization." Advances in neural information processing systems. 1992.
http://papers.nips.cc/paper/563-a-simple-weight-decay-can-improve-generalization.pdf 

# AdamW

Loshchilov, Ilya, and Frank Hutter. "Decoupled weight decay regularization." arXiv preprint arXiv:1711.05101 (2019).
https://arxiv.org/pdf/1711.05101.pdf

-----

What role does weight decay play in neural networks? What about momentum? And normalization? - Zhihu (知乎)
https://www.zhihu.com/question/24529483 

-----

DNN tip
http://speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2016/Lecture/DNN%20tip.pdf 

The Star Also Rises: AI from Scratch (37): Weight Decay
