2019/12/09
-----
http://speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2016/Lecture/DNN%20tip.pdf
-----
-----
References
# Weight Decay
Zhang, Guodong, et al. "Three mechanisms of weight decay regularization." arXiv preprint arXiv:1810.12281 (2018).
https://arxiv.org/pdf/1810.12281.pdf // WD 1989
Hanson, Stephen José, and Lorien Y. Pratt. "Comparing biases for minimal network construction with back-propagation." Advances in neural information processing systems. 1989.
http://papers.nips.cc/paper/156-comparing-biases-for-minimal-network-construction-with-back-propagation.pdf // WD 1992
Krogh, Anders, and John A. Hertz. "A simple weight decay can improve generalization." Advances in neural information processing systems. 1992.
http://papers.nips.cc/paper/563-a-simple-weight-decay-can-improve-generalization.pdf # AdamW
Loshchilov, Ilya, and Frank Hutter. "Decoupled weight decay regularization (2019)." arXiv preprint arXiv:1711.05101.
-----
在神经网络中weight decay起到的做用是什么?momentum呢?normalization呢? - 知乎
https://www.zhihu.com/question/24529483
-----
DNN tip
http://speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2016/Lecture/DNN%20tip.pdf
The Star Also Rises: AI 從頭學(三七):Weight Decay
-----
No comments:
Post a Comment
Note: Only a member of this blog may post a comment.