Tuesday, January 30, 2018

PyTorch (6): Seminar

2018/01/25

Discussion thread for this post:
https://www.facebook.com/groups/2027602154187130/permalink/2065069573773721/

Download all eighteen papers at once:
https://www.dropbox.com/sh/x91gt8jhc4siid4/AACrGEHVgUlfLqHY1aaIjcuCa?dl=0

Preface:

The "CS+X & PyTorch Taipei deep-learning classic paper seminar" now has enough registered speakers to cover all eighteen papers, and the talks will be open to auditors at NTU.

-----

Summary:

The information below is out of date. If you are interested in joining the paper seminars in Taipei, Hsinchu, Taichung, Tainan, or Kaohsiung, please check the announcements in the PyTorch Taiwan Facebook group!

-----

This is an online paper seminar that I [1] run in PyTorch Taiwan [2] (see the Appendix below for the paper list). After publishing a tutorial article in the group [2], each registered PyTorch Taipei speaker will be invited to give a talk in Prof. Tsai's [3] CS+X course at NTU. The venue is General Classroom 305, which seats one hundred people; sessions run every Thursday from 7 to 9 p.m., starting in March 2018. The papers are drawn mainly from [4]; some related articles can be found in [5].

Speaker registration for PyTorch Taipei: [6].
Speaker registration for PyTorch Hsinchu: [7].

For details, please follow the individual paper and talk announcements posted in PyTorch Taiwan. For more papers, see [8].

-----


Fig. 1. 深度學習: Caffe 之經典模型詳解與實戰 [4].

-----

Seminar syllabus (old version)

1. Fundamental

◎ CNN (LeNet) + BP
◎ RNN (LSTM)

2. CNN

◎ AlexNet
◎ ZFNet
◎ NIN
◎ GoogLeNet
◎ VGGNet
◎ SqueezeNet

3. Pre R-CNN

◎ PreVGGNet
◎ SVM
◎ SMO
◎ DPM
◎ SS
◎ FCN

4. R-CNN

◎ R-CNN
◎ SPPNet
◎ Fast R-CNN
◎ Faster R-CNN
◎ YOLO
◎ SSD

-----

Seminar syllabus (new version)

Listening to someone else's talk is of limited use unless you already have some foundation. This list is meant for a small, private study group of six to eight people; a minimal LeNet sketch for session 1.1 follows the list below.

1.1 LeNet
1.2 BP
1.3 Python Lab

2.1 AlexNet
2.2 ZFNet
2.3 NIN

3.1 GoogLeNet
3.2 VGGNet
3.3 LSTM

4.1 ResNet
4.2 FCN
4.3 YOLO v1

5.1 Faster R-CNN
5.2 SSD
5.3 YOLO v2

6.1 FPN
6.2 RetinaNet
6.3 YOLO v3
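
As a warm-up for session 1.1, here is a minimal LeNet-style network in PyTorch. It is a sketch rather than the exact LeNet-5 of the paper (which used trainable sub-sampling layers and RBF output units); the layer sizes assume 32x32 single-channel inputs.

import torch
import torch.nn as nn

class LeNet(nn.Module):
    def __init__(self, num_classes=10):
        super(LeNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5),   # 32x32 -> 28x28
            nn.Tanh(),
            nn.AvgPool2d(2),                  # 28x28 -> 14x14
            nn.Conv2d(6, 16, kernel_size=5),  # 14x14 -> 10x10
            nn.Tanh(),
            nn.AvgPool2d(2),                  # 10x10 -> 5x5
        )
        self.classifier = nn.Sequential(
            nn.Linear(16 * 5 * 5, 120), nn.Tanh(),
            nn.Linear(120, 84), nn.Tanh(),
            nn.Linear(84, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size(0), -1)  # flatten before the fully connected layers
        return self.classifier(x)

net = LeNet()
print(net(torch.randn(1, 1, 32, 32)).shape)  # torch.Size([1, 10])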

-----

1. Fundamental
2. CNN
3. Machine Learning
4. FCN
5. Object Detection
6. Optimization
7. Normalization
8. Regularization
9. Activation Function

-----

References

[1] Marcel @ LinkedIn
https://www.linkedin.com/in/marcel-wang-3a988b7a/

[2] PyTorch Taiwan
https://www.facebook.com/groups/2027602154187130/

[3] NTU CS+X
http://homepage.ntu.edu.tw/~pecutsai/

[4] Book: 深度學習: Caffe 之經典模型詳解與實戰. ISBN: 7121301180. Authors: 樂毅 and 王斌. Publisher: 電子工業出版社. Published 2016-12-01.
http://hemingwang.blogspot.tw/2017/03/aiconvolutional-neural-network_23.html

[5] AI從頭學 (table of contents)
http://hemingwang.blogspot.tw/2016/12/ai_20.html

[6] PyTorch Taipei
http://hemingwang.blogspot.tw/2018/01/pytorchpytorch-taipei_20.html

[7] PyTorch Hsinchu
http://hemingwang.blogspot.tw/2018/01/pytorchpytorch-hsinchu.html

[8] AI從頭學 (34): Complete Works
http://hemingwang.blogspot.tw/2017/08/aicomplete-works.html

-----

Appendix

1. Fundamental

◎ CNN (LeNet) + BP
LeCun, Yann, et al. "Gradient-based learning applied to document recognition." Proceedings of the IEEE 86.11 (1998): 2278-2324.
http://yann.lecun.com/exdb/publis/pdf/lecun-98.pdf

◎ RNN (LSTM)
Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory." Neural computation 9.8 (1997): 1735-1780.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.676.4320&rep=rep1&type=pdf
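
Since backpropagation (BP) pairs with LeNet in the first session, a tiny sanity check may help: compute one gradient by hand with the chain rule and compare it with PyTorch autograd. The numbers here are arbitrary.

import torch

# Forward pass: y = sigmoid(w * x), L = (y - t)^2.
w = torch.tensor(0.5, requires_grad=True)
x, t = torch.tensor(1.5), torch.tensor(1.0)
y = torch.sigmoid(w * x)
loss = (y - t) ** 2

loss.backward()  # backpropagation via autograd

# The same gradient by the chain rule: dL/dw = 2*(y - t) * y*(1 - y) * x.
manual = 2 * (y - t) * y * (1 - y) * x
print(w.grad.item(), manual.item())  # the two values match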
 
-----

2. CNN

◎ AlexNet
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012.
http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf

◎ ZFNet
Zeiler, Matthew D., and Rob Fergus. "Visualizing and understanding convolutional networks." European conference on computer vision. Springer, Cham, 2014.
https://arxiv.org/pdf/1311.2901.pdf 

◎ NIN
Lin, Min, Qiang Chen, and Shuicheng Yan. "Network in network." arXiv preprint arXiv:1312.4400 (2013).
https://arxiv.org/pdf/1312.4400.pdf

◎ GoogLeNet
Szegedy, Christian, et al. "Going deeper with convolutions." Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
http://openaccess.thecvf.com/content_cvpr_2015/papers/Szegedy_Going_Deeper_With_2015_CVPR_paper.pdf

◎ VGGNet
Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
https://arxiv.org/pdf/1409.1556/

◎ SqueezeNet
Iandola, Forrest N., et al. "SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size." arXiv preprint arXiv:1602.07360 (2016).
https://arxiv.org/pdf/1602.07360.pdf

◎ PreVGGNet
Ciresan, Dan C., et al. "Flexible, high performance convolutional neural networks for image classification." IJCAI Proceedings-International Joint Conference on Artificial Intelligence. Vol. 22. No. 1. 2011.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.481.4406&rep=rep1&type=pdf

// CNN+

◎ Inception v2, Batch Normalization
Ioffe, Sergey, and Christian Szegedy. "Batch normalization: Accelerating deep network training by reducing internal covariate shift." International conference on machine learning. 2015.
http://proceedings.mlr.press/v37/ioffe15.pdf

◎ Highway Networks
Srivastava, Rupesh Kumar, Klaus Greff, and Jürgen Schmidhuber. "Highway networks." arXiv preprint arXiv:1505.00387 (2015).
https://arxiv.org/pdf/1505.00387.pdf
 
◎ ResNet
He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
http://openaccess.thecvf.com/content_cvpr_2016/papers/He_Deep_Residual_Learning_CVPR_2016_paper.pdf

◎ ResNet v2
He, Kaiming, et al. "Identity mappings in deep residual networks." European Conference on Computer Vision. Springer, Cham, 2016.
https://arxiv.org/pdf/1603.05027.pdf 

◎ ResNet Theory
Eldan, Ronen, and Ohad Shamir. "The power of depth for feedforward neural networks." Conference on Learning Theory. 2016.
http://proceedings.mlr.press/v49/eldan16.pdf

◎ DenseNet
Huang, Gao, et al. "Densely connected convolutional networks." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
http://openaccess.thecvf.com/content_cvpr_2017/papers/Huang_Densely_Connected_Convolutional_CVPR_2017_paper.pdf

◎ Inception v3
Szegedy, Christian, et al. "Rethinking the inception architecture for computer vision." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Szegedy_Rethinking_the_Inception_CVPR_2016_paper.pdf

◎ Inception v4
Szegedy, Christian, et al. "Inception-v4, inception-resnet and the impact of residual connections on learning." AAAI. Vol. 4. 2017.
http://www.aaai.org/ocs/index.php/AAAI/AAAI17/paper/download/14806/14311

3. Machine Learning
 
◎ SVM
Boser, Bernhard E., Isabelle M. Guyon, and Vladimir N. Vapnik. "A training algorithm for optimal margin classifiers." Proceedings of the fifth annual workshop on Computational learning theory. ACM, 1992.
http://webmail.svms.org/training/BOGV92.pdf

◎ SMO
Platt, John. "Sequential minimal optimization: A fast algorithm for training support vector machines." (1998).
https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tr-98-14.pdf
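
The SVM objective in Boser et al. can also be optimized directly by gradient descent: hinge loss plus an L2 penalty on the weights. Below is a minimal PyTorch sketch on toy 2-D data with labels in {-1, +1}; note this is plain gradient training, not Platt's SMO algorithm.

import torch

torch.manual_seed(0)
X = torch.randn(100, 2)
y = (X[:, 0] + X[:, 1] > 0).float() * 2 - 1  # linearly separable toy labels

w = torch.zeros(2, requires_grad=True)
b = torch.zeros(1, requires_grad=True)
opt = torch.optim.SGD([w, b], lr=0.1)
lam = 1e-2  # L2 (margin) regularization strength

for _ in range(200):
    opt.zero_grad()
    margins = y * (X @ w + b)  # y_i * (w . x_i + b)
    loss = torch.clamp(1 - margins, min=0).mean() + lam * w.pow(2).sum()
    loss.backward()
    opt.step()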

4. FCN
 
◎ FCN
Long, Jonathan, Evan Shelhamer, and Trevor Darrell. "Fully convolutional networks for semantic segmentation." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
https://www.cv-foundation.org/openaccess/content_cvpr_2015/app/2B_011.pdf

// FCN+

◎ DeepLab v2
Chen, Liang-Chieh, et al. "Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs." arXiv preprint arXiv:1606.00915 (2016).
https://arxiv.org/pdf/1606.00915.pdf 

◎ DeepLab v3
Chen, Liang-Chieh, et al. "Rethinking atrous convolution for semantic image segmentation." arXiv preprint arXiv:1706.05587 (2017).
https://arxiv.org/pdf/1706.05587.pdf

-----

5. Object Detection

◎ DPM
Felzenszwalb, Pedro F., et al. "Object detection with discriminatively trained part-based models." IEEE transactions on pattern analysis and machine intelligence 32.9 (2010): 1627-1645.
http://vc.cs.nthu.edu.tw/home/paper/codfiles/vclab/201402141243/Object-Detection-with-Discriminatively-Trained-Part-Based-Models.pdf

◎ SS
Uijlings, Jasper RR, et al. "Selective search for object recognition." International journal of computer vision 104.2 (2013): 154-171.
https://ivi.fnwi.uva.nl/isis/publications/2013/UijlingsIJCV2013/UijlingsIJCV2013.pdf

◎ R-CNN
Girshick, Ross, et al. "Rich feature hierarchies for accurate object detection and semantic segmentation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2014.
https://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Girshick_Rich_Feature_Hierarchies_2014_CVPR_paper.pdf?spm=5176.100239.blogcont55892.8.pm8zm1&file=Girshick_Rich_Feature_Hierarchies_2014_CVPR_paper.pdf

◎ SPPNet
He, Kaiming, et al. "Spatial pyramid pooling in deep convolutional networks for visual recognition." european conference on computer vision. Springer, Cham, 2014.
https://arxiv.org/pdf/1406.4729.pdf
 
◎ Fast R-CNN
Girshick, Ross. "Fast R-CNN." Proceedings of the IEEE international conference on computer vision. 2015.
http://openaccess.thecvf.com/content_iccv_2015/papers/Girshick_Fast_R-CNN_ICCV_2015_paper.pdf

◎ Faster R-CNN
Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time object detection with region proposal networks." Advances in neural information processing systems. 2015.
http://papers.nips.cc/paper/5638-faster-r-cnn-towards-real-time-object-detection-with-region-proposal-networks.pdf

◎ YOLO v1
Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Redmon_You_Only_Look_CVPR_2016_paper.pdf

◎ SSD
Liu, Wei, et al. "SSD: Single shot multibox detector." European conference on computer vision. Springer, Cham, 2016.
https://arxiv.org/pdf/1512.02325.pdf

◎ YOLO v2
Redmon, Joseph, and Ali Farhadi. "YOLO9000: better, faster, stronger." arXiv preprint arXiv:1612.08242 (2016).
https://arxiv.org/pdf/1612.08242.pdf

◎ YOLO v3
Redmon, Joseph, and Ali Farhadi. "YOLOv3: An incremental improvement." arXiv preprint arXiv:1804.02767 (2018).
https://pjreddie.com/media/files/papers/YOLOv3.pdf

◎ R-FCN
Dai, Jifeng, et al. "R-FCN: Object detection via region-based fully convolutional networks." Advances in neural information processing systems. 2016.

◎ Mask R-CNN
He, Kaiming, et al. "Mask R-CNN." Proceedings of the IEEE international conference on computer vision. 2017.

◎ PVANet
Kim, Kye-Hyeon, et al. "PVANET: deep but lightweight neural networks for real-time object detection." arXiv preprint arXiv:1608.08021 (2016).
https://arxiv.org/pdf/1608.08021.pdf
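
One small piece of machinery shared by all of these detectors is intersection-over-union (IoU), the box-overlap measure used for matching proposals to ground truth, for evaluation, and for non-maximum suppression. A minimal sketch, with boxes given as (x1, y1, x2, y2):

def iou(a, b):
    # Corners of the overlap rectangle (empty if the boxes are disjoint).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1/7, about 0.143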

-----

6. Optimization

◎ SGD
Bottou, Léon. "Stochastic gradient descent tricks." Neural networks: Tricks of the trade. Springer, Berlin, Heidelberg, 2012. 421-436.
https://www.microsoft.com/en-us/research/wp-content/uploads/2012/01/tricks-2012.pdf 

◎ NAG
Sutskever, Ilya, et al. "On the importance of initialization and momentum in deep learning." International conference on machine learning. 2013.
http://www.cs.toronto.edu/~fritz/absps/momentum.pdf
 
◎ Adagrad
Duchi, John, Elad Hazan, and Yoram Singer. "Adaptive subgradient methods for online learning and stochastic optimization." Journal of Machine Learning Research 12.Jul (2011): 2121-2159.
http://www.jmlr.org/papers/volume12/duchi11a/duchi11a.pdf
  
◎ Adadelta
Zeiler, Matthew D. "ADADELTA: an adaptive learning rate method." arXiv preprint arXiv:1212.5701 (2012).
https://arxiv.org/pdf/1212.5701.pdf

◎ RMSprop
Tieleman, Tijmen, and Geoffrey Hinton. "Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude." COURSERA: Neural networks for machine learning 4.2 (2012): 26-31.
http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf

◎ Adam
Kingma, Diederik P., and Jimmy Ba. "Adam: A method for stochastic optimization." arXiv preprint arXiv:1412.6980 (2014).
https://arxiv.org/abs/1412.6980

◎ Nadam
Dozat, Timothy. "Incorporating nesterov momentum into adam." (2016).
https://openreview.net/pdf?id=OM0jvwB8jIp57ZJjtNEZ

◎ AMSgrad
Reddi, Sashank J., Satyen Kale, and Sanjiv Kumar. "On the convergence of adam and beyond." International Conference on Learning Representations. 2018.
http://www.satyenkale.com/papers/amsgrad.pdf
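
All of the optimizers above map directly onto torch.optim. The sketch below assumes a recent PyTorch and uses illustrative, untuned hyperparameters; Nadam only ships as optim.NAdam in newer releases, so it is left as a comment.

import torch.nn as nn
import torch.optim as optim

net = nn.Linear(10, 2)  # stand-in for any nn.Module
params = list(net.parameters())

sgd      = optim.SGD(params, lr=0.01, momentum=0.9)
nag      = optim.SGD(params, lr=0.01, momentum=0.9, nesterov=True)  # NAG
adagrad  = optim.Adagrad(params, lr=0.01)
adadelta = optim.Adadelta(params)
rmsprop  = optim.RMSprop(params, lr=0.001)
adam     = optim.Adam(params, lr=0.001)
amsgrad  = optim.Adam(params, lr=0.001, amsgrad=True)  # the AMSGrad fix
# nadam  = optim.NAdam(params, lr=0.001)                # PyTorch >= 1.10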

-----

7. Normalization

◎ Batch Normalization
Ioffe, Sergey, and Christian Szegedy. "Batch normalization: Accelerating deep network training by reducing internal covariate shift." International conference on machine learning. 2015.
http://proceedings.mlr.press/v37/ioffe15.pdf

◎ Layer Normalization
Ba, Jimmy Lei, Jamie Ryan Kiros, and Geoffrey E. Hinton. "Layer normalization." arXiv preprint arXiv:1607.06450 (2016).
https://arxiv.org/pdf/1607.06450.pdf

◎ Weight Normalization
Salimans, Tim, and Diederik P. Kingma. "Weight normalization: A simple reparameterization to accelerate training of deep neural networks." Advances in Neural Information Processing Systems. 2016.
http://papers.nips.cc/paper/6114-weight-normalization-a-simple-reparameterization-to-accelerate-training-of-deep-neural-networks.pdf

◎ Group Normalization
Wu, Yuxin, and Kaiming He. "Group Normalization." arXiv preprint arXiv:1803.08494 (2018).
https://arxiv.org/pdf/1803.08494.pdf

◎ Self-Normalizing Neural Networks
Klambauer, Günter, et al. "Self-normalizing neural networks." Advances in Neural Information Processing Systems. 2017.
http://papers.nips.cc/paper/6698-self-normalizing-neural-networks.pdf 

◎ Overparameterization
Arora, Sanjeev, Nadav Cohen, and Elad Hazan. "On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization." arXiv preprint arXiv:1802.06509 (2018).
https://arxiv.org/pdf/1802.06509.pdf
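
In recent PyTorch, most of the normalization schemes above are one-line layers; a sketch on a (N, C, H, W) activation tensor follows. Weight normalization is different in kind: it reparameterizes a layer's weights rather than normalizing its activations.

import torch
import torch.nn as nn

x = torch.randn(8, 32, 14, 14)  # (N, C, H, W) feature maps

bn = nn.BatchNorm2d(32)                           # per channel, over N, H, W
ln = nn.LayerNorm([32, 14, 14])                   # per sample, over C, H, W
gn = nn.GroupNorm(num_groups=8, num_channels=32)  # per group of channels
selu = nn.SELU()                                  # self-normalizing activation (SNN paper)

outs = [bn(x), ln(x), gn(x), selu(x)]  # all keep the input shape

wn_conv = nn.utils.weight_norm(nn.Conv2d(32, 64, 3))  # weight normalization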

-----

8. Regularization

◎ Weight Decay
◎ Dropout
◎ DropConnect
◎ Concrete Dropout
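
The first two items are one-liners in PyTorch, as sketched below; DropConnect and Concrete Dropout have no built-in layers and are omitted here.

import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(100, 50),
    nn.ReLU(),
    nn.Dropout(p=0.5),  # zero 50% of activations at random during training
    nn.Linear(50, 10),
)

# L2 weight decay is an optimizer argument, not part of the model.
opt = torch.optim.SGD(model.parameters(), lr=0.01, weight_decay=5e-4)

model.train()  # dropout active
model.eval()   # dropout disabled at evaluation time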

-----

9. Activation Function

◎ Overview
Ramachandran, Prajit, Barret Zoph, and Quoc V. Le. "Searching for activation functions." (2018).
https://openreview.net/pdf?id=Hkuq2EkPf
https://arxiv.org/pdf/1710.05941.pdf
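
The best activation found by this paper's search is Swish, f(x) = x * sigmoid(beta * x). A minimal sketch; recent PyTorch also ships the beta = 1 case as nn.SiLU.

import torch

def swish(x, beta=1.0):
    # Swish: smooth, non-monotonic, bounded below.
    return x * torch.sigmoid(beta * x)

print(swish(torch.tensor([-1.0, 0.0, 1.0])))  # tensor([-0.2689, 0.0000, 0.7311])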
