Thursday, December 05, 2019

Deep Learning Highlight

2019/04/25

Overview:

These are introductory recommendations, laid out in the order I followed while teaching myself deep learning.

There are five tracks: a 3-paper quick track, for a "quick" overview of the whole of deep learning; a 20-paper slow track, centered on computer vision; a 30-paper foundation track, which adds fundamental topics and natural language processing; a 10-paper essentials track, distilled from the foundation track; and a 50-paper complete track, covering every paper mentioned above.

-----


Fig. 1. Deep Learning: Caffe's Classic Models Explained and in Practice [1].

-----

Quick Track (3 Papers)

1. Deep Learning
2. LeNet
3. LSTM

-----

Slow Track (20 Papers)

1. LeNet
2. LSTM

3. AlexNet
4. ZFNet
5. NIN
6. GoogLeNet
7. VGGNet
8. SqueezeNet

9. PreVGGNet
10. SVM
11. SMO
12. DPM
13. SS
14. FCN

15. R-CNN
16. SPPNet
17. Fast R-CNN
18. Faster R-CNN
19. YOLO
20. SSD

-----

Foundation Track (30 Papers)

1. LeNet (AlexNet, Dropout)
2. NIN (VGGNet, Weight Decay, Momentum)
3. ResNet (Batch Normalization)
4. FCN (Mask R-CNN)
5. YOLOv1 (Faster R-CNN, SSD, YOLOv2, FPN, RetinaNet, YOLOv3)

6. LSTM (NNLM, Word2vec)
7. Seq2seq
8. Attention (Layer Normalization)
9. ConvS2S (Adam)
10. Transformer (ELMo, GPT, BERT)

-----

Essentials Track (10 Papers)

1. LeNet
2. NIN
3. ResNet
4. FCN
5. YOLOv1

6. LSTM
7. Seq2seq
8. Attention
9. ConvS2S
10. Transformer

-----

Complete Track (50 Papers)

CNN 9 (LeNet, AlexNet, ZFNet, NIN, GoogLeNet, VGGNet, PreVGGNet, Highway, ResNet)

Semantic Segmentation 4 (FCN, U-Net, DeepLab v3+), (Mask R-CNN)

Object Detection 14 (DPM, SS, R-CNN, SPPNet, Fast R-CNN, Faster R-CNN), (YOLOv1), (SSD, R-FCN, YOLOv2, FPN, RetinaNet, YOLOv3, M2Det)

Optimization 6 (SGD, Momentum, NAG, AdaGrad, AdaDelta, RMSProp, Adam)

Regularization 2 (Weight Decay, Dropout)

Normalization 5 (Batch, Weight, Layer, Instance, Group)

NLP 10 (LSTM, NNLM, Word2vec, Seq2seq, Attention, ConvS2S, Transformer, ELMo, GPT, BERT)

-----

Legend:

# basic
// advanced

-----

Paper

# Deep Learning
LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. "Deep learning." nature 521.7553 (2015): 436.
https://creativecoding.soe.ucsc.edu/courses/cs523/slides/week3/DeepLearning_LeCun.pdf

// GPU
Raina, Rajat, Anand Madhavan, and Andrew Y. Ng. "Large-scale deep unsupervised learning using graphics processors." Proceedings of the 26th annual international conference on machine learning. ACM, 2009.
http://robotics.stanford.edu/~ang/papers/icml09-LargeScaleUnsupervisedDeepLearningGPU.pdf

// Difficult 1994
Bengio, Yoshua, Patrice Simard, and Paolo Frasconi. "Learning long-term dependencies with gradient descent is difficult." IEEE transactions on neural networks 5.2 (1994): 157-166.
https://pdfs.semanticscholar.org/d0be/39ee052d246ae99c082a565aba25b811be2d.pdf

// Difficult 2010
Glorot, Xavier, and Yoshua Bengio. "Understanding the difficulty of training deep feedforward neural networks." Proceedings of the thirteenth international conference on artificial intelligence and statistics. 2010.
http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf 

// Difficult 2013
Pascanu, Razvan, Tomas Mikolov, and Yoshua Bengio. "On the difficulty of training recurrent neural networks." International conference on machine learning. 2013.
http://proceedings.mlr.press/v28/pascanu13.pdf
 
-----

Part I: Computer Vision

-----

◎ Image Classification

-----

# LeNet
LeCun, Yann, et al. "Gradient-based learning applied to document recognition." Proceedings of the IEEE 86.11 (1998): 2278-2324.
http://yann.lecun.com/exdb/publis/pdf/lecun-01a.pdf

# AlexNet
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012.
http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf

# ZFNet
Zeiler, Matthew D., and Rob Fergus. "Visualizing and understanding convolutional networks." European conference on computer vision. Springer, Cham, 2014.
https://arxiv.org/pdf/1311.2901.pdf

-----

# NIN
Lin, Min, Qiang Chen, and Shuicheng Yan. "Network in network." arXiv preprint arXiv:1312.4400 (2013).
https://arxiv.org/pdf/1312.4400.pdf

# SENet
Hu, Jie, Li Shen, and Gang Sun. "Squeeze-and-excitation networks." Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.
http://openaccess.thecvf.com/content_cvpr_2018/papers/Hu_Squeeze-and-Excitation_Networks_CVPR_2018_paper.pdf

# GoogLeNet
Szegedy, Christian, et al. "Going deeper with convolutions." Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
http://openaccess.thecvf.com/content_cvpr_2015/papers/Szegedy_Going_Deeper_With_2015_CVPR_paper.pdf

# VGGNet
Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
https://arxiv.org/pdf/1409.1556/

# PreVGGNet
Ciresan, Dan C., et al. "Flexible, high performance convolutional neural networks for image classification." IJCAI Proceedings-International Joint Conference on Artificial Intelligence. Vol. 22. No. 1. 2011.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.481.4406&rep=rep1&type=pdf
  
# Highway v1
Srivastava, Rupesh Kumar, Klaus Greff, and Jürgen Schmidhuber. "Highway networks." arXiv preprint arXiv:1505.00387 (2015).
https://arxiv.org/pdf/1505.00387.pdf

# Highway v2
Srivastava, Rupesh K., Klaus Greff, and Jürgen Schmidhuber. "Training very deep networks." Advances in neural information processing systems. 2015.
https://papers.nips.cc/paper/5850-training-very-deep-networks.pdf

-----
 
# ResNet
He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
http://openaccess.thecvf.com/content_cvpr_2016/papers/He_Deep_Residual_Learning_CVPR_2016_paper.pdf
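
The core idea of the ResNet paper above — learning a residual F(x) and adding it back to the input — can be sketched in a few lines. This is a toy, framework-free illustration (the names `residual_block` and `scale` are mine, not the authors'):

```python
def residual_block(x, f):
    """Residual connection: the block learns f(x) and outputs x + f(x).

    If f collapses toward zero, the block degenerates to the identity,
    which is what makes very deep stacks trainable.
    """
    return [xi + fi for xi, fi in zip(x, f(x))]

# Toy "learned" transform: here just a fixed elementwise scaling.
scale = lambda v: [0.1 * xi for xi in v]

out = residual_block([1.0, 2.0, 3.0], scale)  # input plus small residual
```

In a real network f would be a stack of conv-BN-ReLU layers; the addition is the only part the sketch keeps faithful.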

# ResNet v2
He, Kaiming, et al. "Identity mappings in deep residual networks." European Conference on Computer Vision. Springer, Cham, 2016.
https://arxiv.org/pdf/1603.05027.pdf  

# ResNet-D
Huang, Gao, et al. "Deep networks with stochastic depth." European conference on computer vision. Springer, Cham, 2016.
https://arxiv.org/pdf/1603.09382.pdf

# ResNet-E
Veit, Andreas, Michael J. Wilber, and Serge Belongie. "Residual networks behave like ensembles of relatively shallow networks." Advances in neural information processing systems. 2016.
https://papers.nips.cc/paper/6556-residual-networks-behave-like-ensembles-of-relatively-shallow-networks.pdf

# ResNet-S
Orhan, A. Emin, and Xaq Pitkow. "Skip connections eliminate singularities." arXiv preprint arXiv:1701.09175 (2017).
https://arxiv.org/pdf/1701.09175.pdf

# WRN
Zagoruyko, Sergey, and Nikos Komodakis. "Wide residual networks." arXiv preprint arXiv:1605.07146 (2016).
https://arxiv.org/pdf/1605.07146.pdf

# ResNeXt
Xie, Saining, et al. "Aggregated residual transformations for deep neural networks." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
http://openaccess.thecvf.com/content_cvpr_2017/papers/Xie_Aggregated_Residual_Transformations_CVPR_2017_paper.pdf 

# DenseNet
Huang, Gao, et al. "Densely connected convolutional networks." Proceedings of the IEEE conference on computer vision and pattern recognition. Vol. 1. No. 2. 2017.
http://openaccess.thecvf.com/content_cvpr_2017/papers/Huang_Densely_Connected_Convolutional_CVPR_2017_paper.pdf

# DPN
Chen, Yunpeng, et al. "Dual path networks." Advances in Neural Information Processing Systems. 2017.
https://papers.nips.cc/paper/7033-dual-path-networks.pdf

# DLA
Yu, Fisher, et al. "Deep layer aggregation." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
http://openaccess.thecvf.com/content_cvpr_2018/papers/Yu_Deep_Layer_Aggregation_CVPR_2018_paper.pdf

# Res2Net
Gao, Shang-Hua, et al. "Res2Net: A New Multi-scale Backbone Architecture." arXiv preprint arXiv:1904.01169 (2019).
https://arxiv.org/pdf/1904.01169.pdf 

# Inception v3
Szegedy, Christian, et al. "Rethinking the inception architecture for computer vision." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Szegedy_Rethinking_the_Inception_CVPR_2016_paper.pdf

# Inception v4
Szegedy, Christian, et al. "Inception-v4, inception-resnet and the impact of residual connections on learning." AAAI. Vol. 4. 2017.
http://www.aaai.org/ocs/index.php/AAAI/AAAI17/paper/download/14806/14311

# PolyNet
Zhang, Xingcheng, et al. "Polynet: A pursuit of structural diversity in very deep networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.
http://openaccess.thecvf.com/content_cvpr_2017/papers/Zhang_PolyNet_A_Pursuit_CVPR_2017_paper.pdf

-----

Mobile

-----

# SqueezeNet
Iandola, Forrest N., et al. "SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size." arXiv preprint arXiv:1602.07360 (2016).
https://arxiv.org/pdf/1602.07360.pdf

# MobileNet v1

# MobileNet v2

# MobileNet v3

# ShuffleNet v1
Zhang, Xiangyu, et al. "Shufflenet: An extremely efficient convolutional neural network for mobile devices." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
http://openaccess.thecvf.com/content_cvpr_2018/papers/Zhang_ShuffleNet_An_Extremely_CVPR_2018_paper.pdf

# ShuffleNet v2
Ma, Ningning, et al. "Shufflenet v2: Practical guidelines for efficient cnn architecture design." Proceedings of the European Conference on Computer Vision (ECCV). 2018.
http://openaccess.thecvf.com/content_ECCV_2018/papers/Ningning_Light-weight_CNN_Architecture_ECCV_2018_paper.pdf

# Xception
Chollet, François. "Xception: Deep learning with depthwise separable convolutions." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
http://openaccess.thecvf.com/content_cvpr_2017/papers/Chollet_Xception_Deep_Learning_CVPR_2017_paper.pdf

-----

// NAS-RL

// NASNet

// pNASNet

// AmoebaNet

// mNASNet
 
// EfficientNet
Tan, Mingxing, and Quoc V. Le. "EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks." arXiv preprint arXiv:1905.11946 (2019).
https://arxiv.org/pdf/1905.11946.pdf

-----

◎ Semantic Segmentation

-----

// SDS
Hariharan, Bharath, et al. "Simultaneous detection and segmentation." European Conference on Computer Vision. Springer, Cham, 2014.
https://arxiv.org/pdf/1407.1808.pdf

# FCN
Long, Jonathan, Evan Shelhamer, and Trevor Darrell. "Fully convolutional networks for semantic segmentation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
https://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Long_Fully_Convolutional_Networks_2015_CVPR_paper.pdf

# DeconvNet
Noh, Hyeonwoo, Seunghoon Hong, and Bohyung Han. "Learning deconvolution network for semantic segmentation." Proceedings of the IEEE international conference on computer vision. 2015.
https://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Noh_Learning_Deconvolution_Network_ICCV_2015_paper.pdf

# SegNet
Badrinarayanan, Vijay, Alex Kendall, and Roberto Cipolla. "Segnet: A deep convolutional encoder-decoder architecture for image segmentation." IEEE transactions on pattern analysis and machine intelligence 39.12 (2017): 2481-2495.
https://arxiv.org/pdf/1511.00561.pdf

# U-Net
Ronneberger, Olaf, Philipp Fischer, and Thomas Brox. "U-net: Convolutional networks for biomedical image segmentation." International Conference on Medical image computing and computer-assisted intervention. Springer, Cham, 2015.
https://arxiv.org/pdf/1505.04597.pdf

# U-Net++
Zhou, Zongwei, et al. "Unet++: A nested u-net architecture for medical image segmentation." Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support. Springer, Cham, 2018. 3-11.
https://arxiv.org/pdf/1807.10165.pdf

# DilatedNet
Yu, Fisher, and Vladlen Koltun. "Multi-scale context aggregation by dilated convolutions." arXiv preprint arXiv:1511.07122 (2015).
https://arxiv.org/pdf/1511.07122.pdf 
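
Dilated (atrous) convolution — the building block of DilatedNet, DRN, and the DeepLab family below — enlarges the receptive field by spacing out the kernel taps. A minimal 1-D sketch, illustrative only:

```python
def dilated_conv1d(x, kernel, dilation=1):
    """1-D convolution with dilated kernel taps (no padding).

    Tap k reads input position i + k * dilation, so a 3-tap kernel
    with dilation 2 covers a span of 5 inputs without adding parameters.
    """
    span = (len(kernel) - 1) * dilation
    out = []
    for i in range(len(x) - span):
        out.append(sum(kernel[k] * x[i + k * dilation]
                       for k in range(len(kernel))))
    return out

# dilation=1 is an ordinary convolution; dilation=2 skips every other input
print(dilated_conv1d([1, 2, 3, 4, 5], [1, 1, 1], dilation=1))  # [6, 9, 12]
print(dilated_conv1d([1, 2, 3, 4, 5], [1, 1, 1], dilation=2))  # [9] = 1+3+5
```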

# ENet
Paszke, Adam, et al. "Enet: A deep neural network architecture for real-time semantic segmentation." arXiv preprint arXiv:1606.02147 (2016).
https://arxiv.org/pdf/1606.02147.pdf
 
# DRN
Yu, Fisher, Vladlen Koltun, and Thomas Funkhouser. "Dilated residual networks." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
http://openaccess.thecvf.com/content_cvpr_2017/papers/Yu_Dilated_Residual_Networks_CVPR_2017_paper.pdf

# FC-CRF
Krähenbühl, Philipp, and Vladlen Koltun. "Efficient inference in fully connected crfs with gaussian edge potentials." Advances in neural information processing systems. 2011.
http://papers.nips.cc/paper/4296-efficient-inference-in-fully-connected-crfs-with-gaussian-edge-potentials.pdf

# DeepLab v1
Chen, Liang-Chieh, et al. "Semantic image segmentation with deep convolutional nets and fully connected crfs." arXiv preprint arXiv:1412.7062 (2014).
https://arxiv.org/pdf/1412.7062.pdf

# DeepLab v2
Chen, Liang-Chieh, et al. "Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs." arXiv preprint arXiv:1606.00915 (2016).
https://arxiv.org/pdf/1606.00915.pdf 

# DeepLab v3
Chen, Liang-Chieh, et al. "Rethinking atrous convolution for semantic image segmentation." arXiv preprint arXiv:1706.05587 (2017).
https://arxiv.org/pdf/1706.05587.pdf  

# DeepLab v3+
Chen, Liang-Chieh, et al. "Encoder-decoder with atrous separable convolution for semantic image segmentation." Proceedings of the European conference on computer vision (ECCV). 2018.
http://openaccess.thecvf.com/content_ECCV_2018/papers/Liang-Chieh_Chen_Encoder-Decoder_with_Atrous_ECCV_2018_paper.pdf

# ResNet-38
Wu, Zifeng, Chunhua Shen, and Anton Van Den Hengel. "Wider or deeper: Revisiting the resnet model for visual recognition." Pattern Recognition 90 (2019): 119-133.
https://arxiv.org/pdf/1611.10080.pdf

# RefineNet
Lin, Guosheng, et al. "Refinenet: Multi-path refinement networks for high-resolution semantic segmentation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
http://openaccess.thecvf.com/content_cvpr_2017/papers/Lin_RefineNet_Multi-Path_Refinement_CVPR_2017_paper.pdf 

# RefineNet-LW
Nekrasov, Vladimir, Chunhua Shen, and Ian Reid. "Light-weight refinenet for real-time semantic segmentation." arXiv preprint arXiv:1810.03272 (2018).
https://arxiv.org/pdf/1810.03272.pdf

# RefineNet-AA
Nekrasov, Vladimir, et al. "Real-time joint semantic segmentation and depth estimation using asymmetric annotations." 2019 International Conference on Robotics and Automation (ICRA). IEEE, 2019.
https://arxiv.org/pdf/1809.04766.pdf

# PSPNet
Zhao, Hengshuang, et al. "Pyramid scene parsing network." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
http://openaccess.thecvf.com/content_cvpr_2017/papers/Zhao_Pyramid_Scene_Parsing_CVPR_2017_paper.pdf

# ICNet
Zhao, Hengshuang, et al. "Icnet for real-time semantic segmentation on high-resolution images." Proceedings of the European Conference on Computer Vision (ECCV). 2018.
http://openaccess.thecvf.com/content_ECCV_2018/papers/Hengshuang_Zhao_ICNet_for_Real-Time_ECCV_2018_paper.pdf

# BiSeNet
Yu, Changqian, et al. "Bisenet: Bilateral segmentation network for real-time semantic segmentation." Proceedings of the European Conference on Computer Vision (ECCV). 2018.
http://openaccess.thecvf.com/content_ECCV_2018/papers/Changqian_Yu_BiSeNet_Bilateral_Segmentation_ECCV_2018_paper.pdf

# Fast-SCNN
Poudel, Rudra PK, Stephan Liwicki, and Roberto Cipolla. "Fast-SCNN: fast semantic segmentation network." arXiv preprint arXiv:1902.04502 (2019).
https://arxiv.org/pdf/1902.04502.pdf

# BlitzNet
Dvornik, Nikita, et al. "Blitznet: A real-time deep network for scene understanding." Proceedings of the IEEE international conference on computer vision. 2017.
http://openaccess.thecvf.com/content_ICCV_2017/papers/Dvornik_BlitzNet_A_Real-Time_ICCV_2017_paper.pdf

// ESPNet v1

// ESPNet v2

// Auto-DeepLab

// SA-GAN

// DANet

// OCNet
  
-----

◎ Object Detection

-----

// SVM

// SMO
Platt, John. "Sequential minimal optimization: A fast algorithm for training support vector machines." (1998).
https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/tr-98-14.pdf

// SIFT

// HOG
 
// DPM

-----

# DPM
Felzenszwalb, Pedro F., et al. "Object detection with discriminatively trained part-based models." IEEE transactions on pattern analysis and machine intelligence 32.9 (2010): 1627-1645.
https://ttic.uchicago.edu/~dmcallester/lsvm-pami.pdf

# SS
Uijlings, Jasper RR, et al. "Selective search for object recognition." International journal of computer vision 104.2 (2013): 154-171.
https://ivi.fnwi.uva.nl/isis/publications/2013/UijlingsIJCV2013/UijlingsIJCV2013.pdf

# R-CNN
Girshick, Ross, et al. "Rich feature hierarchies for accurate object detection and semantic segmentation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2014.
https://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Girshick_Rich_Feature_Hierarchies_2014_CVPR_paper.pdf?spm=5176.100239.blogcont55892.8.pm8zm1&file=Girshick_Rich_Feature_Hierarchies_2014_CVPR_paper.pdf

# SPPNet
He, Kaiming, et al. "Spatial pyramid pooling in deep convolutional networks for visual recognition." european conference on computer vision. Springer, Cham, 2014.
https://arxiv.org/pdf/1406.4729.pdf
 
# Fast R-CNN
Girshick, Ross. "Fast R-CNN." Proceedings of the IEEE international conference on computer vision. 2015.
http://openaccess.thecvf.com/content_iccv_2015/papers/Girshick_Fast_R-CNN_ICCV_2015_paper.pdf

# Faster R-CNN
Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time object detection with region proposal networks." Advances in neural information processing systems. 2015.
http://papers.nips.cc/paper/5638-faster-r-cnn-towards-real-time-object-detection-with-region-proposal-networks.pdf

# OverFeat
Sermanet, Pierre, et al. "Overfeat: Integrated recognition, localization and detection using convolutional networks." arXiv preprint arXiv:1312.6229 (2013).
https://arxiv.org/pdf/1312.6229.pdf
 
# YOLO v1
Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Redmon_You_Only_Look_CVPR_2016_paper.pdf

# SSD
Liu, Wei, et al. "SSD: Single shot multibox detector." European conference on computer vision. Springer, Cham, 2016.
https://arxiv.org/pdf/1512.02325.pdf
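
Detectors such as YOLO and SSD above rely on intersection-over-union (IoU) both to match predicted boxes to ground truth and to suppress duplicate detections. A minimal sketch with boxes as (x1, y1, x2, y2) corners — illustrative, not from any of the papers' code:

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # overlap 1, union 7 -> 1/7
```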

# DSSD
Fu, Cheng-Yang, et al. "Dssd: Deconvolutional single shot detector." arXiv preprint arXiv:1701.06659 (2017).
https://arxiv.org/pdf/1701.06659.pdf

# YOLO v2
Redmon, Joseph, and Ali Farhadi. "YOLO9000: better, faster, stronger." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.
https://arxiv.org/pdf/1612.08242.pdf

# ION
Bell, Sean, et al. "Inside-outside net: Detecting objects in context with skip pooling and recurrent neural networks." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
http://openaccess.thecvf.com/content_cvpr_2016/papers/Bell_Inside-Outside_Net_Detecting_CVPR_2016_paper.pdf

# R-FCN
Dai, Jifeng, et al. "R-fcn: Object detection via region-based fully convolutional networks." Advances in neural information processing systems. 2016.
https://papers.nips.cc/paper/6465-r-fcn-object-detection-via-region-based-fully-convolutional-networks.pdf

# SATO
Huang, Jonathan, et al. "Speed/accuracy trade-offs for modern convolutional object detectors." Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
http://openaccess.thecvf.com/content_cvpr_2017/papers/Huang_SpeedAccuracy_Trade-Offs_for_CVPR_2017_paper.pdf

# DCN v1
Dai, Jifeng, et al. "Deformable convolutional networks." Proceedings of the IEEE international conference on computer vision. 2017.
http://openaccess.thecvf.com/content_ICCV_2017/papers/Dai_Deformable_Convolutional_Networks_ICCV_2017_paper.pdf

# DCN v2
Zhu, Xizhou, et al. "Deformable convnets v2: More deformable, better results." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019.
http://openaccess.thecvf.com/content_CVPR_2019/papers/Zhu_Deformable_ConvNets_V2_More_Deformable_Better_Results_CVPR_2019_paper.pdf

# Cascade R-CNN
Cai, Zhaowei, and Nuno Vasconcelos. "Cascade r-cnn: Delving into high quality object detection." Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.
http://openaccess.thecvf.com/content_cvpr_2018/papers/Cai_Cascade_R-CNN_Delving_CVPR_2018_paper.pdf   

# FPN
Lin, Tsung-Yi, et al. "Feature pyramid networks for object detection." CVPR. Vol. 1. No. 2. 2017.
http://openaccess.thecvf.com/content_cvpr_2017/papers/Lin_Feature_Pyramid_Networks_CVPR_2017_paper.pdf 

# STDN
Zhou, Peng, et al. "Scale-transferrable object detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
http://openaccess.thecvf.com/content_cvpr_2018/papers/Zhou_Scale-Transferrable_Object_Detection_CVPR_2018_paper.pdf

# YOLO v3
Redmon, Joseph, and Ali Farhadi. "YOLOv3: An incremental improvement." arXiv preprint arXiv:1804.02767 (2018).
https://pjreddie.com/media/files/papers/YOLOv3.pdf 

# RON
Kong, Tao, et al. "Ron: Reverse connection with objectness prior networks for object detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.
http://openaccess.thecvf.com/content_cvpr_2017/papers/Kong_RON_Reverse_Connection_CVPR_2017_paper.pdf 

# RefineDet
Zhang, Shifeng, et al. "Single-shot refinement neural network for object detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.
http://openaccess.thecvf.com/content_cvpr_2018/papers/Zhang_Single-Shot_Refinement_Neural_CVPR_2018_paper.pdf
 
# M2Det
Zhao, Qijie, et al. "M2det: A single-shot object detector based on multi-level feature pyramid network." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 33. 2019.
https://arxiv.org/pdf/1811.04533.pdf

# SNIP
Singh, Bharat, and Larry S. Davis. "An analysis of scale invariance in object detection snip." Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.
http://openaccess.thecvf.com/content_cvpr_2018/papers/Singh_An_Analysis_of_CVPR_2018_paper.pdf

# SNIPER
Singh, Bharat, Mahyar Najibi, and Larry S. Davis. "SNIPER: Efficient multi-scale training." Advances in Neural Information Processing Systems. 2018.
http://papers.nips.cc/paper/8143-sniper-efficient-multi-scale-training.pdf

# AutoFocus
Najibi, Mahyar, Bharat Singh, and Larry S. Davis. "Autofocus: Efficient multi-scale inference." Proceedings of the IEEE International Conference on Computer Vision. 2019.
http://openaccess.thecvf.com/content_ICCV_2019/papers/Najibi_AutoFocus_Efficient_Multi-Scale_Inference_ICCV_2019_paper.pdf

# DetNet
Li, Zeming, et al. "Detnet: Design backbone for object detection." Proceedings of the European Conference on Computer Vision (ECCV). 2018.
http://openaccess.thecvf.com/content_ECCV_2018/papers/Zeming_Li_DetNet_Design_Backbone_ECCV_2018_paper.pdf

# TridentNet
Li, Yanghao, et al. "Scale-aware trident networks for object detection." arXiv preprint arXiv:1901.01892 (2019).
https://arxiv.org/pdf/1901.01892.pdf

# OHEM
Shrivastava, Abhinav, Abhinav Gupta, and Ross Girshick. "Training region-based object detectors with online hard example mining." Proceedings of the IEEE conference on computer vision and pattern recognition. 2016.
https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Shrivastava_Training_Region-Based_Object_CVPR_2016_paper.pdf

# Focal Loss
Lin, Tsung-Yi, et al. "Focal loss for dense object detection." IEEE transactions on pattern analysis and machine intelligence (2018).
https://vision.cornell.edu/se3/wp-content/uploads/2017/09/focal_loss.pdf
https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8417976
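
The focal loss above downweights easy examples by scaling cross-entropy with (1 − p_t)^γ, so training focuses on hard ones. A scalar sketch (γ = 2 and α = 0.25 are the paper's defaults; the function itself is illustrative):

```python
import math

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction p = P(class=1), label y in {0, 1}.

    The (1 - p_t)**gamma factor shrinks the loss of well-classified
    examples; alpha balances the positive/negative classes.
    """
    p_t = p if y == 1 else 1.0 - p
    a_t = alpha if y == 1 else 1.0 - alpha
    return -a_t * (1.0 - p_t) ** gamma * math.log(p_t)

# An easy positive (p=0.9) contributes far less than a hard one (p=0.1)
print(focal_loss(0.9, 1), focal_loss(0.1, 1))
```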

# GHM
Li, Buyu, Yu Liu, and Xiaogang Wang. "Gradient harmonized single-stage detector." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 33. 2019.
https://arxiv.org/pdf/1811.05181.pdf

# Libra R-CNN
Pang, Jiangmiao, et al. "Libra r-cnn: Towards balanced learning for object detection." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019.
http://openaccess.thecvf.com/content_CVPR_2019/papers/Pang_Libra_R-CNN_Towards_Balanced_Learning_for_Object_Detection_CVPR_2019_paper.pdf

# DCR v1
Cheng, Bowen, et al. "Revisiting rcnn: On awakening the classification power of faster rcnn." Proceedings of the European Conference on Computer Vision (ECCV). 2018.
http://openaccess.thecvf.com/content_ECCV_2018/papers/Bowen_Cheng_Revisiting_RCNN_On_ECCV_2018_paper.pdf
 
# DCR v2
Cheng, Bowen, et al. "Decoupled classification refinement: Hard false positive suppression for object detection." arXiv preprint arXiv:1810.04002 (2018).
https://arxiv.org/pdf/1810.04002.pdf

# PISA
Cao, Yuhang, et al. "Prime Sample Attention in Object Detection." arXiv preprint arXiv:1904.04821 (2019).
https://arxiv.org/pdf/1904.04821.pdf

-----

// CornerNet
Law, Hei, and Jia Deng. "Cornernet: Detecting objects as paired keypoints." Proceedings of the European Conference on Computer Vision (ECCV). 2018.
http://openaccess.thecvf.com/content_ECCV_2018/papers/Hei_Law_CornerNet_Detecting_Objects_ECCV_2018_paper.pdf
 
// CenterNet
Duan, Kaiwen, et al. "CenterNet: Object Detection with Keypoint Triplets." arXiv preprint arXiv:1904.08189 (2019).
https://arxiv.org/pdf/1904.08189.pdf
 
// SelectNet
Liu, Yunru, Tingran Gao, and Haizhao Yang. "SelectNet: Learning to Sample from the Wild for Imbalanced Data Training." arXiv preprint arXiv:1905.09872 (2019).
https://arxiv.org/pdf/1905.09872.pdf

// Bottom-up
Zhou, Xingyi, Jiacheng Zhuo, and Philipp Krahenbuhl. "Bottom-up object detection by grouping extreme and center points." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019.
http://openaccess.thecvf.com/content_CVPR_2019/papers/Zhou_Bottom-Up_Object_Detection_by_Grouping_Extreme_and_Center_Points_CVPR_2019_paper.pdf 

-----

Instance Segmentation 

-----

# MNC
Dai, Jifeng, Kaiming He, and Jian Sun. "Instance-aware semantic segmentation via multi-task network cascades." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
https://www.cv-foundation.org/openaccess/content_cvpr_2016/papers/Dai_Instance-Aware_Semantic_Segmentation_CVPR_2016_paper.pdf

// DeepMask
Pinheiro, Pedro O., Ronan Collobert, and Piotr Dollár. "Learning to segment object candidates." Advances in Neural Information Processing Systems. 2015.
https://papers.nips.cc/paper/5852-learning-to-segment-object-candidates.pdf

// SharpMask
Pinheiro, Pedro O., et al. "Learning to refine object segments." European Conference on Computer Vision. Springer, Cham, 2016.
https://arxiv.org/pdf/1603.08695.pdf

// MultiPathNet
Zagoruyko, Sergey, et al. "A multipath network for object detection." arXiv preprint arXiv:1604.02135 (2016).
https://arxiv.org/pdf/1604.02135.pdf

# InstanceFCN
Dai, Jifeng, et al. "Instance-sensitive fully convolutional networks." European Conference on Computer Vision. Springer, Cham, 2016.
https://arxiv.org/pdf/1603.08678.pdf

# FCIS
Li, Yi, et al. "Fully convolutional instance-aware semantic segmentation." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.
http://openaccess.thecvf.com/content_cvpr_2017/papers/Li_Fully_Convolutional_Instance-Aware_CVPR_2017_paper.pdf

# Mask R-CNN
He, Kaiming, et al. "Mask r-cnn." Computer Vision (ICCV), 2017 IEEE International Conference on. IEEE, 2017.
http://openaccess.thecvf.com/content_ICCV_2017/papers/He_Mask_R-CNN_ICCV_2017_paper.pdf

-----

◎ Face Detection

-----





-----

◎ Face Recognition

-----

# DeepFace

# DeepID

# VGGFace

-----

# FaceNet
Schroff, Florian, Dmitry Kalenichenko, and James Philbin. "Facenet: A unified embedding for face recognition and clustering." Proceedings of the IEEE conference on computer vision and pattern recognition. 2015.
https://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Schroff_FaceNet_A_Unified_2015_CVPR_paper.pdf

# LMNN
Weinberger, Kilian Q., John Blitzer, and Lawrence K. Saul. "Distance metric learning for large margin nearest neighbor classification." Advances in neural information processing systems. 2006.
http://papers.nips.cc/paper/2795-distance-metric-learning-for-large-margin-nearest-neighbor-classification.pdf

-----

# Center Loss

# Sphere Face(A-softmax)

# CosFace(AM-softmax)

# ArcFace

# MobileID

# MobileFace

# OpenFace

# SeetaFace

-----

◎ Visual Tracking

-----

Part II: Natural Language Processing

-----

LSTM

-----

// RNN(Recurrent Neural Network)
Elman, Jeffrey L. "Finding structure in time." Cognitive science 14.2 (1990): 179-211.
http://www2.fiit.stuba.sk/~kvasnicka/NeuralNetworks/6.prednaska/Elman_SRNN_paper.pdf

# LSTM(Long Short-Term Memory)
Hochreiter, Sepp, and Jürgen Schmidhuber. "Long short-term memory." Neural computation 9.8 (1997): 1735-1780.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.676.4320&rep=rep1&type=pdf
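
The LSTM of Hochreiter and Schmidhuber keeps a separate memory state c, regulated by forget, input, and output gates. A single-step scalar sketch — the weight values here are illustrative placeholders, not trained parameters:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h, c, W):
    """One LSTM step on scalars. W maps gate name -> (w_x, w_h, bias).

    f, i, o are the forget, input, and output gates; g is the candidate
    memory. c passes through almost unchanged when f is near 1 and i is
    near 0, which is how gradients survive long sequences.
    """
    f = sigmoid(W['f'][0] * x + W['f'][1] * h + W['f'][2])
    i = sigmoid(W['i'][0] * x + W['i'][1] * h + W['i'][2])
    g = math.tanh(W['g'][0] * x + W['g'][1] * h + W['g'][2])
    o = sigmoid(W['o'][0] * x + W['o'][1] * h + W['o'][2])
    c_new = f * c + i * g
    h_new = o * math.tanh(c_new)
    return h_new, c_new

W = {k: (0.5, 0.5, 0.0) for k in 'figo'}
h, c = lstm_step(1.0, 0.0, 0.0, W)
```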

// BRNN(Bidirectional RNN)
Schuster, Mike, and Kuldip K. Paliwal. "Bidirectional recurrent neural networks." IEEE Transactions on Signal Processing 45.11 (1997): 2673-2681.
http://www.cs.cmu.edu/afs/cs/user/bhiksha/WWW/courses/deeplearning/Fall.2016/pdfs/Schuster97_BRNN.pdf

// BLSTM(Bidirectional LSTM)
Graves, Alex, Navdeep Jaitly, and Abdel-rahman Mohamed. "Hybrid speech recognition with deep bidirectional LSTM." Automatic Speech Recognition and Understanding (ASRU), 2013 IEEE Workshop on. IEEE, 2013.
http://www.cs.toronto.edu/~graves/asru_2013.pdf

# GRU(Gated Recurrent Unit)
Cho, Kyunghyun, et al. "Learning phrase representations using RNN encoder-decoder for statistical machine translation." arXiv preprint arXiv:1406.1078 (2014).
https://arxiv.org/pdf/1406.1078.pdf
 
// MGU(Minimal Gated Unit)
Zhou, Guo-Bing, et al. "Minimal gated unit for recurrent neural networks." International Journal of Automation and Computing 13.3 (2016): 226-234.
https://arxiv.org/pdf/1603.09420.pdf

// SRU(Simple Recurrent Unit)
Lei, Tao, et al. "Simple recurrent units for highly parallelizable recurrence." Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. 2018.
https://arxiv.org/pdf/1709.02755.pdf

// Comparison of LSTM, GRU, MGU, and SRU
Hou, Bo-Jian, and Zhi-Hua Zhou. "Learning with Interpretable Structure from RNN." arXiv preprint arXiv:1810.10708 (2018).
https://arxiv.org/pdf/1810.10708.pdf 

-----

# NNLM
Bengio, Yoshua, et al. "A neural probabilistic language model." Journal of machine learning research 3.Feb (2003): 1137-1155.
http://www.jmlr.org/papers/volume3/bengio03a/bengio03a.pdf 

# Word2vec
Mikolov, Tomas, et al. "Efficient estimation of word representations in vector space." arXiv preprint arXiv:1301.3781 (2013).
https://arxiv.org/pdf/1301.3781.pdf
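
Word2vec's skip-gram variant trains on (center, context) word pairs drawn from a sliding window over the corpus. The pair extraction can be sketched as follows (window size and sentence are illustrative; real training would feed these pairs to a shallow network with negative sampling):

```python
def skipgram_pairs(tokens, window=2):
    """Yield (center, context) training pairs for skip-gram."""
    pairs = []
    for i, center in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

print(skipgram_pairs(['the', 'quick', 'brown', 'fox'], window=1))
# [('the', 'quick'), ('quick', 'the'), ('quick', 'brown'),
#  ('brown', 'quick'), ('brown', 'fox'), ('fox', 'brown')]
```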

// Hierarchical Softmax and Negative Sampling
Mikolov, Tomas, et al. "Distributed representations of words and phrases and their compositionality." Advances in neural information processing systems. 2013.
https://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf

// Doc2vec
Le, Quoc, and Tomas Mikolov. "Distributed representations of sentences and documents." International conference on machine learning. 2014.
http://proceedings.mlr.press/v32/le14.pdf

// Word2vec Explained
Goldberg, Yoav, and Omer Levy. "word2vec Explained: deriving Mikolov et al.'s negative-sampling word-embedding method." arXiv preprint arXiv:1402.3722 (2014).
https://arxiv.org/pdf/1402.3722.pdf

// Word2vec Learning
Rong, Xin. "word2vec parameter learning explained." arXiv preprint arXiv:1411.2738 (2014).
https://arxiv.org/pdf/1411.2738.pdf
 
-----

Seq2seq

-----

# Seq2seq - using LSTM
Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural networks." Advances in neural information processing systems. 2014.
http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf 

-----

# GloVe
Pennington, Jeffrey, Richard Socher, and Christopher Manning. "Glove: Global vectors for word representation." Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP). 2014.
https://www.aclweb.org/anthology/D14-1162

# fastText
Joulin, Armand, et al. "Bag of tricks for efficient text classification." arXiv preprint arXiv:1607.01759 (2016).
https://arxiv.org/pdf/1607.01759.pdf

# Skip-Thought
Kiros, Ryan, et al. "Skip-thought vectors." Advances in neural information processing systems. 2015.
https://papers.nips.cc/paper/5950-skip-thought-vectors.pdf

# Quick-Thought
Logeswaran, Lajanugen, and Honglak Lee. "An efficient framework for learning sentence representations." arXiv preprint arXiv:1803.02893 (2018).
https://arxiv.org/pdf/1803.02893.pdf

# InferSent
Conneau, Alexis, et al. "Supervised learning of universal sentence representations from natural language inference data." arXiv preprint arXiv:1705.02364 (2017).
https://arxiv.org/pdf/1705.02364.pdf


-----

Attention

-----

# Attention - using GRU
Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." arXiv preprint arXiv:1409.0473 (2014).
https://arxiv.org/pdf/1409.0473.pdf

# Attention - using LSTM
Luong, Minh-Thang, Hieu Pham, and Christopher D. Manning. "Effective approaches to attention-based neural machine translation." arXiv preprint arXiv:1508.04025 (2015).
https://arxiv.org/pdf/1508.04025.pdf 

# Visual Attention
Xu, Kelvin, et al. "Show, attend and tell: Neural image caption generation with visual attention." International conference on machine learning. 2015.
http://proceedings.mlr.press/v37/xuc15.pdf
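The papers above share one pattern: score the decoder's query against every encoder state, softmax the scores, and take the weighted sum as a context vector. A sketch of Luong's simplest (dot) scoring variant — Bahdanau's version replaces the dot product with a small feed-forward scorer; the shapes and random states are illustrative.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def dot_attention(query, keys, values):
    """Luong-style dot attention: one context vector per decoder step."""
    scores = keys @ query            # (T,) one score per source position
    weights = softmax(scores)        # attention distribution over the source
    context = weights @ values       # weighted sum of encoder states
    return context, weights

rng = np.random.default_rng(2)
T, D = 6, 4                          # source length, state size (toy)
enc_states = rng.normal(size=(T, D)) # encoder hidden states (keys = values)
query = rng.normal(size=D)           # a decoder hidden state
context, weights = dot_attention(query, enc_states, enc_states)
```

Because the weights form a proper distribution over source positions, they can be read off directly as a soft alignment — the property the "Show, attend and tell" paper visualizes over image regions.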

-----

NTM & Memory

-----

# NTM

Graves, Alex, Greg Wayne, and Ivo Danihelka. "Neural turing machines." arXiv preprint arXiv:1410.5401 (2014).
https://arxiv.org/pdf/1410.5401.pdf

// MN
Weston, Jason, Sumit Chopra, and Antoine Bordes. "Memory networks." arXiv preprint arXiv:1410.3916 (2014).
https://arxiv.org/abs/1410.3916

// EEMN
Sukhbaatar, Sainbayar, Jason Weston, and Rob Fergus. "End-to-end memory networks." Advances in neural information processing systems. 2015.
https://papers.nips.cc/paper/5846-end-to-end-memory-networks.pdf

// KVMN
Miller, Alexander, et al. "Key-value memory networks for directly reading documents." arXiv preprint arXiv:1606.03126 (2016).
https://arxiv.org/pdf/1606.03126.pdf

// PN
Vinyals, Oriol, Meire Fortunato, and Navdeep Jaitly. "Pointer networks." Advances in Neural Information Processing Systems. 2015.
http://papers.nips.cc/paper/5866-pointer-networks.pdf

// Set2set
Vinyals, Oriol, Samy Bengio, and Manjunath Kudlur. "Order matters: Sequence to sequence for sets." arXiv preprint arXiv:1511.06391 (2015).
https://arxiv.org/pdf/1511.06391.pdf

// RL NTM
Zaremba, Wojciech, and Ilya Sutskever. "Reinforcement learning neural turing machines-revised." arXiv preprint arXiv:1505.00521 (2015).
https://arxiv.org/pdf/1505.00521.pdf

// Hybrid Computing
Graves, Alex, et al. "Hybrid computing using a neural network with dynamic external memory." Nature 538.7626 (2016): 471.
https://campus.swarma.org/public/ueditor/php/upload/file/20170609/1497019302822809.pdf

// Implementing NTM
Collier, Mark, and Joeran Beel. "Implementing Neural Turing Machines." International Conference on Artificial Neural Networks. Springer, Cham, 2018.
https://arxiv.org/pdf/1807.08518.pdf

-----

ConvS2S

-----

// FSA
Daniluk, Michał, et al. "Frustratingly short attention spans in neural language modeling." arXiv preprint arXiv:1702.04521 (2017).
https://arxiv.org/pdf/1702.04521.pdf

// MHA
Iida, Shohei, et al. "A Multi-Hop Attention for RNN based Neural Machine Translation." Proceedings of The 8th Workshop on Patent and Scientific Literature Translation. 2019.
https://www.aclweb.org/anthology/W19-7203

// AOH
Iida, Shohei, et al. "Attention over Heads: A Multi-Hop Attention for Neural Machine Translation." Proceedings of the 57th Conference of the Association for Computational Linguistics: Student Research Workshop. 2019.
https://www.aclweb.org/anthology/P19-2030

// GLU
Dauphin, Yann N., et al. "Language modeling with gated convolutional networks." Proceedings of the 34th International Conference on Machine Learning-Volume 70. JMLR. org, 2017.
https://arxiv.org/pdf/1612.08083.pdf

# ConvS2S
Gehring, Jonas, et al. "Convolutional sequence to sequence learning." Proceedings of the 34th International Conference on Machine Learning-Volume 70. JMLR. org, 2017.
https://arxiv.org/pdf/1705.03122.pdf
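The GLU and ConvS2S papers above build sequence models from convolutions gated elementwise: h = (X∗W + b) ⊗ σ(X∗V + c). A sketch of one gated, causal 1-D convolution layer under that formula — toy shapes and random weights, not the papers' trained stacks.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def glu_conv1d(x, W_a, W_b, k=3):
    """Gated linear unit over a 1-D convolution: the sigmoid branch
    decides how much of each linear response passes through."""
    T, D = x.shape
    pad = np.pad(x, ((k - 1, 0), (0, 0)))    # left padding keeps it causal
    out = np.empty((T, W_a.shape[1]))
    for t in range(T):
        window = pad[t:t + k].reshape(-1)    # flatten the k-step window
        out[t] = (window @ W_a) * sigmoid(window @ W_b)
    return out

rng = np.random.default_rng(3)
T, D, D_out, k = 5, 4, 6, 3
x = rng.normal(size=(T, D))
W_a = rng.normal(size=(k * D, D_out))        # linear branch weights
W_b = rng.normal(size=(k * D, D_out))        # gate branch weights
y = glu_conv1d(x, W_a, W_b, k)
```

Unlike the recurrent decoders above, every output position here depends only on a fixed window of inputs, which is what lets ConvS2S parallelize training over time steps.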

-----

# ELMo
Peters, Matthew E., et al. "Deep contextualized word representations." arXiv preprint arXiv:1802.05365 (2018).
https://arxiv.org/pdf/1802.05365.pdf 

# AWD-LSTM
Merity, Stephen, Nitish Shirish Keskar, and Richard Socher. "Regularizing and optimizing LSTM language models." arXiv preprint arXiv:1708.02182 (2017).
https://arxiv.org/pdf/1708.02182.pdf

# ULMFiT
Howard, Jeremy, and Sebastian Ruder. "Universal language model fine-tuning for text classification." arXiv preprint arXiv:1801.06146 (2018).
https://arxiv.org/pdf/1801.06146.pdf

-----

Transformer

-----

# Transformer
Vaswani, Ashish, et al. "Attention is all you need." Advances in Neural Information Processing Systems. 2017.
https://papers.nips.cc/paper/7181-attention-is-all-you-need.pdf
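The core of the Transformer is Eq. (1) of the paper, Attention(Q, K, V) = softmax(QK^T / √d_k)V, applied in parallel over several heads. A minimal single-sequence sketch with random weights and no masking, positional encoding, or feed-forward sublayer; all sizes are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_attention(Q, K, V):
    """softmax(Q K^T / sqrt(d_k)) V — every query attends to every key."""
    d_k = Q.shape[-1]
    weights = softmax(Q @ K.T / np.sqrt(d_k))
    return weights @ V, weights

def multi_head_self_attention(X, heads, rng):
    """Split the model dimension across heads, attend per head,
    concatenate, and project back."""
    d_model = X.shape[-1]
    d_h = d_model // heads
    outs = []
    for _ in range(heads):
        Wq, Wk, Wv = (rng.normal(0, 0.1, (d_model, d_h)) for _ in range(3))
        head_out, _ = scaled_dot_attention(X @ Wq, X @ Wk, X @ Wv)
        outs.append(head_out)
    W_o = rng.normal(0, 0.1, (d_model, d_model))
    return np.concatenate(outs, axis=-1) @ W_o

rng = np.random.default_rng(4)
X = rng.normal(size=(7, 16))                  # 7 tokens, d_model = 16
Y = multi_head_self_attention(X, heads=4, rng=rng)
_, attn = scaled_dot_attention(X, X, X)       # each row sums to 1
```

Every token attends to every other in one step — no recurrence, no convolution — which is the sense in which attention is "all you need".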

# BERT
Devlin, Jacob, et al. "Bert: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).
https://arxiv.org/pdf/1810.04805.pdf

// MTL 0
Baxter, Jonathan. "A model of inductive bias learning." Journal of artificial intelligence research 12 (2000): 149-198.
https://arxiv.org/pdf/1106.0245.pdf
  
// MTL 1
Collobert, Ronan, and Jason Weston. "A unified architecture for natural language processing: Deep neural networks with multitask learning." Proceedings of the 25th international conference on Machine learning. ACM, 2008.
http://www.thespermwhale.com/jaseweston/papers/unified_nlp.pdf

// MTL 2
Collobert, Ronan, et al. "Natural language processing (almost) from scratch." Journal of machine learning research 12.Aug (2011): 2493-2537.
http://www.jmlr.org/papers/volume12/collobert11a/collobert11a.pdf

// MTL 3
Ruder, Sebastian. "An overview of multi-task learning in deep neural networks." arXiv preprint arXiv:1706.05098 (2017).
https://arxiv.org/pdf/1706.05098.pdf

-----

# GPT-1
Radford, Alec, et al. "Improving language understanding by generative pre-training." OpenAI Technical Report (2018).
https://www.cs.ubc.ca/~amuham01/LING530/papers/radford2018improving.pdf 
 
# Universal Transformers
Dehghani, Mostafa, et al. "Universal transformers." arXiv preprint arXiv:1807.03819 (2018).
https://arxiv.org/pdf/1807.03819.pdf

# Transformer XL
Dai, Zihang, et al. "Transformer-xl: Attentive language models beyond a fixed-length context." arXiv preprint arXiv:1901.02860 (2019).
https://arxiv.org/pdf/1901.02860.pdf
 
# MT-DNN
Liu, Xiaodong, et al. "Multi-Task Deep Neural Networks for Natural Language Understanding." arXiv preprint arXiv:1901.11504 (2019).
https://arxiv.org/pdf/1901.11504.pdf

# GPT-2
Radford, Alec, et al. "Language models are unsupervised multitask learners." OpenAI Blog 1.8 (2019): 9.

// Visualizing Attention
Vig, Jesse. "Visualizing Attention in Transformer-Based Language models." arXiv preprint arXiv:1904.02679 (2019).
https://arxiv.org/pdf/1904.02679.pdf

// ERNIE Baidu
Sun, Yu, et al. "ERNIE: Enhanced Representation through Knowledge Integration." arXiv preprint arXiv:1904.09223 (2019).
https://arxiv.org/pdf/1904.09223.pdf

// ERNIE THU
Zhang, Zhengyan, et al. "ERNIE: Enhanced Language Representation with Informative Entities." arXiv preprint arXiv:1905.07129 (2019).
https://arxiv.org/pdf/1905.07129.pdf

// XLMs Facebook
Lample, Guillaume, and Alexis Conneau. "Cross-lingual Language Model Pretraining." arXiv preprint arXiv:1901.07291 (2019).
https://arxiv.org/pdf/1901.07291.pdf

// LASER Facebook
Artetxe, Mikel, and Holger Schwenk. "Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond." arXiv preprint arXiv:1812.10464 (2018).
https://arxiv.org/pdf/1812.10464.pdf

// MASS Microsoft
Song, Kaitao, et al. "Mass: Masked sequence to sequence pre-training for language generation." arXiv preprint arXiv:1905.02450 (2019).
https://arxiv.org/pdf/1905.02450.pdf

// UNILM Microsoft
Dong, Li, et al. "Unified Language Model Pre-training for Natural Language Understanding and Generation." arXiv preprint arXiv:1905.03197 (2019).
https://arxiv.org/pdf/1905.03197.pdf

// ON-LSTM
Shen, Yikang, et al. "Ordered neurons: Integrating tree structures into recurrent neural networks." arXiv preprint arXiv:1810.09536 (2018).
https://arxiv.org/pdf/1810.09536.pdf

// XLNet
Yang, Zhilin, et al. "XLNet: Generalized Autoregressive Pretraining for Language Understanding." arXiv preprint arXiv:1906.08237 (2019).
https://arxiv.org/pdf/1906.08237.pdf

-----

Part III:Fundamental Topics

-----

Back Propagation

-----

# Back Propagation
Alber, Maximilian, et al. "Backprop evolution." arXiv preprint arXiv:1808.02822 (2018).
https://arxiv.org/pdf/1808.02822.pdf

-----

Optimization

-----

# SGD
Bottou, Léon. "Stochastic gradient descent tricks." Neural networks: Tricks of the trade. Springer, Berlin, Heidelberg, 2012. 421-436.
https://www.microsoft.com/en-us/research/wp-content/uploads/2012/01/tricks-2012.pdf 

# Momentum
Sutskever, Ilya, et al. "On the importance of initialization and momentum in deep learning." International conference on machine learning. 2013.
http://proceedings.mlr.press/v28/sutskever13.pdf

# NAG
Nesterov, Yurii. "A method of solving a convex programming problem with convergence rate O(1/k^2)." Soviet Mathematics Doklady. Vol. 27. 1983.
http://mpawankumar.info/teaching/cdt-big-data/nesterov83.pdf
 
# AdaGrad
Duchi, John, Elad Hazan, and Yoram Singer. "Adaptive subgradient methods for online learning and stochastic optimization." Journal of Machine Learning Research 12.Jul (2011): 2121-2159.
http://www.jmlr.org/papers/volume12/duchi11a/duchi11a.pdf
  
# AdaDelta
Zeiler, Matthew D. "ADADELTA: an adaptive learning rate method." arXiv preprint arXiv:1212.5701 (2012).
https://arxiv.org/pdf/1212.5701.pdf

# RMSProp
Tieleman, Tijmen, and Geoffrey Hinton. "Lecture 6.5-rmsprop: Divide the gradient by a running average of its recent magnitude." COURSERA: Neural networks for machine learning 4.2 (2012): 26-31.
http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf

# Adam
Kingma, Diederik P., and Jimmy Ba. "Adam: A method for stochastic optimization." arXiv preprint arXiv:1412.6980 (2014).
https://arxiv.org/pdf/1412.6980.pdf
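The Adam update from Kingma & Ba above combines Momentum-style and RMSProp-style running averages with bias correction. A self-contained sketch on a toy quadratic objective (the objective and hyperparameters are illustrative, not from the paper's experiments):

```python
import numpy as np

def adam_step(theta, grad, state, lr=0.1, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: bias-corrected running averages of the gradient
    (m) and its square (v) give each parameter its own step size."""
    state["t"] += 1
    state["m"] = b1 * state["m"] + (1 - b1) * grad
    state["v"] = b2 * state["v"] + (1 - b2) * grad ** 2
    m_hat = state["m"] / (1 - b1 ** state["t"])
    v_hat = state["v"] / (1 - b2 ** state["t"])
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps)

# Minimize the toy objective f(theta) = ||theta||^2, gradient 2 * theta.
theta = np.array([2.0, -3.0])
state = {"t": 0, "m": np.zeros(2), "v": np.zeros(2)}
for _ in range(500):
    theta = adam_step(theta, 2.0 * theta, state)
final_norm = float(np.linalg.norm(theta))
```

The bias-correction terms matter early on, when m and v are still near their zero initialization; the later AMSGrad and RAdam papers revisit exactly this warm-up behaviour.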

# Nadam
Dozat, Timothy. "Incorporating Nesterov momentum into Adam." ICLR Workshop (2016).
https://openreview.net/pdf?id=OM0jvwB8jIp57ZJjtNEZ

# AMSGrad
Reddi, Sashank J., Satyen Kale, and Sanjiv Kumar. "On the convergence of adam and beyond." International Conference on Learning Representations. 2018.
http://www.satyenkale.com/papers/amsgrad.pdf 

# CLR
Smith, Leslie N. "Cyclical learning rates for training neural networks." 2017 IEEE Winter Conference on Applications of Computer Vision (WACV). IEEE, 2017.
https://arxiv.org/pdf/1506.01186.pdf

# SGDR
Loshchilov, Ilya, and Frank Hutter. "Sgdr: Stochastic gradient descent with warm restarts." arXiv preprint arXiv:1608.03983 (2016).
https://arxiv.org/pdf/1608.03983.pdf

# AdamW
Loshchilov, Ilya, and Frank Hutter. "Decoupled weight decay regularization." arXiv preprint arXiv:1711.05101 (2017).
https://arxiv.org/pdf/1711.05101.pdf

# Super-Convergence
Smith, Leslie N., and Nicholay Topin. "Super-convergence: Very fast training of residual networks using large learning rates." (2018).
https://openreview.net/pdf?id=H1A5ztj3b

# Lookahead
Zhang, Michael, et al. "Lookahead Optimizer: k steps forward, 1 step back." Advances in Neural Information Processing Systems. 2019.
http://papers.nips.cc/paper/9155-lookahead-optimizer-k-steps-forward-1-step-back.pdf

# RAdam
Liu, Liyuan, et al. "On the variance of the adaptive learning rate and beyond." arXiv preprint arXiv:1908.03265 (2019).
https://arxiv.org/pdf/1908.03265.pdf

# ADMM
Boyd, Stephen, et al. "Distributed optimization and statistical learning via the alternating direction method of multipliers." Foundations and Trends in Machine Learning 3.1 (2011): 1-122.
https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.360.1664&rep=rep1&type=pdf

# ADMM-S
Taylor, Gavin, et al. "Training neural networks without gradients: A scalable admm approach." International conference on machine learning. 2016.
http://proceedings.mlr.press/v48/taylor16.pdf

# dlADMM
Wang, Junxiang, et al. "ADMM for Efficient Deep Learning with Global Convergence." arXiv preprint arXiv:1905.13611 (2019).
https://arxiv.org/pdf/1905.13611.pdf
 
-----

Regularization

-----

# L2
# Ridge Regression
# Weight Decay
Zhang, Guodong, et al. "Three mechanisms of weight decay regularization." arXiv preprint arXiv:1810.12281 (2018).
https://arxiv.org/pdf/1810.12281.pdf

// WD 1989
Hanson, Stephen José, and Lorien Y. Pratt. "Comparing biases for minimal network construction with back-propagation." Advances in neural information processing systems. 1989.
http://papers.nips.cc/paper/156-comparing-biases-for-minimal-network-construction-with-back-propagation.pdf

// WD 1992
Krogh, Anders, and John A. Hertz. "A simple weight decay can improve generalization." Advances in neural information processing systems. 1992.
http://papers.nips.cc/paper/563-a-simple-weight-decay-can-improve-generalization.pdf

# L1
# Lasso Regression

# Dropout
Srivastava, Nitish, et al. "Dropout: a simple way to prevent neural networks from overfitting." The Journal of Machine Learning Research 15.1 (2014): 1929-1958.
http://www.jmlr.org/papers/volume15/srivastava14a/srivastava14a.pdf
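The dropout idea from Srivastava et al. above is usually implemented in the "inverted" form, scaling at training time so that inference needs no change. A minimal sketch (the layer shape and keep probability are illustrative):

```python
import numpy as np

def dropout(x, p=0.5, train=True, rng=None):
    """Inverted dropout: keep each unit with probability 1 - p and scale
    the survivors by 1/(1 - p), so expected activations match test
    time, when the layer is simply the identity."""
    if not train or p == 0.0:
        return x
    rng = rng or np.random.default_rng()
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

rng = np.random.default_rng(5)
x = np.ones(10000)
y = dropout(x, p=0.5, train=True, rng=rng)
kept = float((y > 0).mean())         # fraction of surviving units, ~0.5
mean_act = float(y.mean())           # ~1.0 thanks to the 1/(1 - p) scaling
```

Most of the variants listed below (DropConnect, DropPath, DropBlock, Zoneout) change only what the mask is applied to — weights, residual paths, contiguous blocks, or recurrent state.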

# Dropconnect
Wan, Li, et al. "Regularization of neural networks using dropconnect." International conference on machine learning. 2013.
http://proceedings.mlr.press/v28/wan13.pdf  

# Maxout
Goodfellow, Ian J., et al. "Maxout networks." arXiv preprint arXiv:1302.4389 (2013).
http://proceedings.mlr.press/v28/goodfellow13.pdf

# DropPath
Larsson, Gustav, Michael Maire, and Gregory Shakhnarovich. "Fractalnet: Ultra-deep neural networks without residuals." arXiv preprint arXiv:1605.07648 (2016).
https://arxiv.org/pdf/1605.07648.pdf

# Scheduled DropPath
Zoph, Barret, et al. "Learning transferable architectures for scalable image recognition." Proceedings of the IEEE conference on computer vision and pattern recognition. 2018.
http://openaccess.thecvf.com/content_cvpr_2018/papers/Zoph_Learning_Transferable_Architectures_CVPR_2018_paper.pdf

# Shake-Shake
Gastaldi, Xavier. "Shake-shake regularization." arXiv preprint arXiv:1705.07485 (2017).
https://arxiv.org/pdf/1705.07485.pdf

# ShakeDrop
Yamada, Yoshihiro, et al. "Shakedrop regularization for deep residual learning." arXiv preprint arXiv:1802.02375 (2018).
https://arxiv.org/pdf/1802.02375.pdf

# Spatial Dropout
Tompson, Jonathan, et al. "Efficient object localization using convolutional networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
https://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Tompson_Efficient_Object_Localization_2015_CVPR_paper.pdf

# Variational Dropout
Kingma, Durk P., Tim Salimans, and Max Welling. "Variational dropout and the local reparameterization trick." Advances in Neural Information Processing Systems. 2015.
https://papers.nips.cc/paper/5666-variational-dropout-and-the-local-reparameterization-trick.pdf

# Information Dropout
Achille, Alessandro, and Stefano Soatto. "Information dropout: Learning optimal representations through noisy computation." IEEE transactions on pattern analysis and machine intelligence 40.12 (2018): 2897-2905.
http://www.vision.jhu.edu/teaching/learning/deeplearning18/assets/Achille_Soatto-18.pdf

# Zoneout
Krueger, David, et al. "Zoneout: Regularizing rnns by randomly preserving hidden activations." arXiv preprint arXiv:1606.01305 (2016).
https://arxiv.org/pdf/1606.01305.pdf
 
# Cutout
DeVries, Terrance, and Graham W. Taylor. "Improved regularization of convolutional neural networks with cutout." arXiv preprint arXiv:1708.04552 (2017).
https://arxiv.org/pdf/1708.04552.pdf

# Dropblock
Ghiasi, Golnaz, Tsung-Yi Lin, and Quoc V. Le. "Dropblock: A regularization method for convolutional networks." Advances in Neural Information Processing Systems. 2018.
http://papers.nips.cc/paper/8271-dropblock-a-regularization-method-for-convolutional-networks.pdf

-----

Normalization 

-----

# Batch Normalization
Ioffe, Sergey, and Christian Szegedy. "Batch normalization: Accelerating deep network training by reducing internal covariate shift." International conference on machine learning. 2015.
http://proceedings.mlr.press/v37/ioffe15.pdf
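The training-mode forward pass of batch normalization from Ioffe & Szegedy above can be sketched directly (inference with running statistics and the backward pass are omitted; batch and feature sizes are illustrative):

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the mini-batch, then scale and shift
    with the learned gamma and beta. (At inference the paper swaps the
    batch statistics for running averages.)"""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

rng = np.random.default_rng(6)
x = rng.normal(loc=5.0, scale=3.0, size=(64, 4))   # shifted, scaled batch
gamma, beta = np.ones(4), np.zeros(4)
y = batch_norm_train(x, gamma, beta)
```

With gamma = 1 and beta = 0 every output feature has near-zero mean and unit variance over the batch; the Layer, Instance, and Group variants below differ only in which axes the statistics are computed over.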

# Weight Normalization
Salimans, Tim, and Durk P. Kingma. "Weight normalization: A simple reparameterization to accelerate training of deep neural networks." Advances in Neural Information Processing Systems. 2016.
https://papers.nips.cc/paper/6114-weight-normalization-a-simple-reparameterization-to-accelerate-training-of-deep-neural-networks.pdf
 
# Layer Normalization
Ba, Jimmy Lei, Jamie Ryan Kiros, and Geoffrey E. Hinton. "Layer normalization." arXiv preprint arXiv:1607.06450 (2016).
https://arxiv.org/pdf/1607.06450.pdf

# Instance Normalization
Ulyanov, Dmitry, Andrea Vedaldi, and Victor Lempitsky. "Instance normalization: The missing ingredient for fast stylization." arXiv preprint arXiv:1607.08022 (2016).
https://arxiv.org/pdf/1607.08022.pdf

# Group Normalization
Wu, Yuxin, and Kaiming He. "Group normalization." Proceedings of the European Conference on Computer Vision (ECCV). 2018.
http://openaccess.thecvf.com/content_ECCV_2018/papers/Yuxin_Wu_Group_Normalization_ECCV_2018_paper.pdf

// Positional Normalization 
Li, Boyi, et al. "Positional Normalization." Advances in Neural Information Processing Systems. 2019.
http://papers.nips.cc/paper/8440-positional-normalization.pdf 

-----

Activation Function

-----

# ReLU 2000
Hahnloser, Richard H. R., et al. "Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit." Nature 405.6789 (2000): 947-951.

# ReLU 2009
Jarrett, Kevin, et al. "What is the best multi-stage architecture for object recognition?" 2009 IEEE 12th International Conference on Computer Vision. IEEE, 2009.

# Softplus
Dugas, Charles, et al. "Incorporating second-order functional knowledge for better option pricing." Advances in neural information processing systems. 2001.

# LReLU
Maas, Andrew L., Awni Y. Hannun, and Andrew Y. Ng. "Rectifier nonlinearities improve neural network acoustic models." Proc. ICML. Vol. 30. No. 1. 2013.

# PReLU
He, Kaiming, et al. "Delving deep into rectifiers: Surpassing human-level performance on imagenet classification." Proceedings of the IEEE international conference on computer vision. 2015.
https://arxiv.org/pdf/1502.01852.pdf

# ELU
Clevert, Djork-Arné, Thomas Unterthiner, and Sepp Hochreiter. "Fast and accurate deep network learning by exponential linear units (ELUs)." arXiv preprint arXiv:1511.07289 (2015).
https://arxiv.org/pdf/1511.07289.pdf

# SELU
Klambauer, Günter, et al. "Self-normalizing neural networks." Advances in neural information processing systems. 2017.
https://arxiv.org/pdf/1706.02515.pdf

# GELU
Hendrycks, Dan, and Kevin Gimpel. "Gaussian error linear units (GELUs)." arXiv preprint arXiv:1606.08415 (2016).
https://arxiv.org/pdf/1606.08415.pdf

# Swish
Ramachandran, Prajit, Barret Zoph, and Quoc V. Le. "Searching for activation functions." arXiv preprint arXiv:1710.05941 (2017).
https://arxiv.org/pdf/1710.05941.pdf
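Swish, the activation found by the search in Ramachandran et al. above, is simply f(x) = x · sigmoid(βx). A one-line sketch (the sample points are illustrative):

```python
import numpy as np

def swish(x, beta=1.0):
    """Swish: f(x) = x * sigmoid(beta * x). With beta = 1 it coincides
    with the SiLU; as beta grows it approaches ReLU."""
    return x / (1.0 + np.exp(-beta * x))

xs = np.linspace(-5.0, 5.0, 11)
ys = swish(xs)                       # smooth, slightly negative near zero
```

Unlike ReLU it is smooth and non-monotonic — slightly negative for small negative inputs — which the paper credits for its gains on deep networks.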

-----

Loss Function

-----

# Loss Function
Barron, Jonathan T. "A general and adaptive robust loss function." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2019.
http://openaccess.thecvf.com/content_CVPR_2019/papers/Barron_A_General_and_Adaptive_Robust_Loss_Function_CVPR_2019_paper.pdf

-----


Part IV

-----

Medicine

-----

// Heart failure
Golas, Sara Bersche, et al. "A machine learning model to predict the risk of 30-day readmissions in patients with heart failure: a retrospective analysis of electronic medical records data." BMC medical informatics and decision making 18.1 (2018): 44.
https://bmcmedinformdecismak.biomedcentral.com/track/pdf/10.1186/s12911-018-0620-z

// Urgent care
Zebin, Tahmina, and Thierry J. Chaussalet. "Design and implementation of a deep recurrent model for prediction of readmission in urgent care using electronic health records." 2019 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB). IEEE, 2019.
https://ueaeprints.uea.ac.uk/71957/1/PID5991805_readmission.pdf

Ashfaq, Awais, et al. "Readmission prediction using deep learning on electronic health records." Journal of biomedical informatics 97 (2019): 103256.
https://www.sciencedirect.com/science/article/pii/S1532046419301753

Rajkomar, Alvin, et al. "Scalable and accurate deep learning with electronic health records." NPJ Digital Medicine 1.1 (2018): 18.
https://www.ehidc.org/sites/default/files/resources/files/electronic%20health%20records.pdf
 
-----

References

# Reference book
[1] Title: 深度學習: Caffe 之經典模型詳解與實戰. ISBN: 7121301180. Author: 樂毅. Publisher: 電子工業. Publication date: 2016-09-30.

# Supplementary material for the LSTM paper
[2] Understanding LSTM Networks -- colah's blog
http://colah.github.io/posts/2015-08-Understanding-LSTMs/ 

# Three-paper quick track (Spring 2017)
[3] Deep Learning Paper
http://hemingwang.blogspot.com/2019/01/deep-learning-paper.html

# Twenty-paper slow track (Spring 2018)
[4] PyTorch (6): Seminar
http://hemingwang.blogspot.com/2018/01/pytorchseminar.html

# Thirty-paper foundation track (Spring 2019)
[5] 30 Topics for Deep Learning
http://hemingwang.blogspot.com/2019/04/30-topics-for-deep-learning.html  

# Ten-paper essentials track (Summer 2019)
[6] The AI Trilogy (Deep Learning: From Beginner to Mastery)
https://hemingwang.blogspot.com/2019/05/trilogy.html

# Fifty-paper complete track (Autumn 2019)
[7] Learning AI from Scratch (39): Complete Works
http://hemingwang.blogspot.tw/2017/08/aicomplete-works.html

Learning AI from Scratch (2021 Edition)

2019/09/11

AI Seminar 2019 Taipei
https://hemingwang.blogspot.com/2019/11/ai-seminar-2019-taipei.html

-----


Fig. 2021 (image source: Pixabay).

-----

Python
https://hemingwang.blogspot.com/2019/02/python.html

-----

Part I:Computer Vision

◎ Image Classification

Stage 01:LeNet、(AlexNet、ZFNet)
Substage:Digital Image Processing
Bonus:Auto Differentiation
Bonus:Back Propagation
Bonus:Computational Graph

Stage 02:NIN、(SENet、GoogLeNet、VGGNet、PreVGGNet、Highway v1 v2)、(Inception v3 v4、PolyNet)

Stage 03:ResNet v1 v2、(ResNet-D、ResNet-E、ResNet-S、WRN、ResNeXt、DenseNet、DPN、DLA、Res2Net)

◎ Mobile
Substage:SqueezeNet、(MobileNet v1 v2 v3、ShuffleNet v1 v2、Xception)

◎ Semantic Segmentation

Stage 04:FCN、(DeconvNet、SegNet、U-Net、U-Net++、DilatedNet、ENet、DRN、FC-CRF、DeepLab v1 v2 v3 v3+、ResNet-38、RefineNet、RefineNet-LW、RefineNet-AA、PSPNet、ICNet、BiSeNet、Fast-SCNN、BlitzNet)

◎ Instance Segmentation
Substage:(Hypercolumn、MNC、DeepMask、SharpMask、MultiPathNet、InstanceFCN、FCIS)、Mask R-CNN、(MaskX R-CNN、MaskLab、PANet、HTC、RetinaMask、MS R-CNN、YOLACT)

◎ Object Detection

Stage 05:(DPM、SS、R-CNN、SPPNet、Fast R-CNN、OHEM、Faster R-CNN、OverFeat)、YOLOv1、(SSD、DSSD、YOLOv2、ION、R-FCN、SATO、DCNv1、DCNv2、Cascade R-CNN、FPN、STDN、YOLOv3、RON、RefineDet、M2Det、DetNet、TridentNet、Focal Loss、GHM、Libra R-CNN、DCRv1、DCRv2、PISA)

◎ Face Detection

◎ Face Recognition

◎ One Shot

◎ Visual Tracking

◎ Dataset

-----

Part II:Natural Language Processing

◎ LSTM
Stage 06:LSTM、(NNLM、Word2vec)
Bonus:Optimization

◎ Seq2seq
Stage 07:Seq2seq、(GloVe、fastText)
Bonus:Regularization

◎ Attention
Stage 08:Attention、(NTM、KVMN)
Bonus:Normalization

◎ ConvS2S
Stage 09:ConvS2S、(ELMo、AWD-LSTM、ULMFiT)
Bonus:Activation Function

◎ Transformer
Stage 10:Transformer、(GPT-1、BERT、GPT-2)
Bonus:Loss Function 

----- 

Part III:Fundamental Topics

◎ Optimization(SGD、Momentum、NAG、AdaGrad、AdaDelta、RMSProp、Adam、Nadam、AMSGrad、CLR、SGDR、AdamW、Super-Convergence、Lookahead、RAdam、ADMM、ADMM-S、dlADMM)

◎ Regularization(L2、L1、L0、Dropout、Dropconnect、Maxout、DropPath、Scheduled DropPath、Shake-Shake、ShakeDrop、Spatial Dropout、Variational Dropout、Information Dropout、Zoneout、Cutout、Dropblock)

◎ Normalization(Batch、Weight、Layer、Instance、Group、Positional)

◎ Activation Function(sigmoid、tanh、ReLU、Softplus、LReLU、PReLU、ELU、SELU、GELU、Swish)

◎ Loss Function

◎ Automatic Differentiation
◎ Back Propagation
◎ Computational Graph

◎ Convolution

-----

Intelligence Science

-----

Intelligence Science
http://hemingwang.blogspot.com/2019/09/intelligence-science.html

-----


Part I:Computer Vision

-----

Computer Vision
https://hemingwang.blogspot.com/2019/10/computer-vision.html

https://hemingwang.blogspot.com/2019/10/gaussiansmooth.html

https://hemingwang.blogspot.com/2019/10/sobeledgedetection.html

https://hemingwang.blogspot.com/2019/10/structuretensor.html

https://hemingwang.blogspot.com/2019/10/nms.html

-----

◎ Image Classification

-----

Image Classification
https://hemingwang.blogspot.com/2019/10/image-classification.html

LeNet
https://hemingwang.blogspot.com/2019/05/trilogy.html
http://hemingwang.blogspot.com/2018/02/deep-learninglenet-bp.html
http://hemingwang.blogspot.com/2017/03/ailenet.html
http://hemingwang.blogspot.com/2017/03/ailenet-f6.html

AlexNet
http://hemingwang.blogspot.com/2017/05/aialexnet.html

ZFNet
http://hemingwang.blogspot.com/2017/05/aikernel-visualizing.html

NIN
http://hemingwang.blogspot.com/2017/06/ainetwork-in-network.html

SENet
https://hemingwang.blogspot.com/2019/10/senet.html

GoogLeNet
http://hemingwang.blogspot.com/2017/06/aigooglenet.html
http://hemingwang.blogspot.com/2017/06/aiconv1.html
http://hemingwang.blogspot.com/2017/08/aiinception.html

VGGNet
http://hemingwang.blogspot.com/2018/09/aivggnet.html 

PreVGGNet
https://hemingwang.blogspot.com/2019/11/prevggnet.html

Highway
http://hemingwang.blogspot.com/2019/11/highway.html

ResNet
https://hemingwang.blogspot.com/2019/05/vanishing-gradient.html
https://hemingwang.blogspot.com/2019/05/exploding-gradient.html
http://hemingwang.blogspot.com/2019/10/an-overview-of-resnet-and-its-variants.html
https://hemingwang.blogspot.com/2019/10/universal-approximation-theorem.html 
https://hemingwang.blogspot.com/2019/10/understanding-boxplots.html
https://hemingwang.blogspot.com/2019/10/ensemble-learning.html
http://hemingwang.blogspot.com/2018/09/airesnet.html

ResNet-D
https://hemingwang.blogspot.com/2019/11/resnet-d.html

ResNet-E
https://hemingwang.blogspot.com/2019/11/resnet-e.html

ResNet-S
https://hemingwang.blogspot.com/2019/11/resnet-s.html

WRN
https://hemingwang.blogspot.com/2019/11/wrn.html

ResNeXt
https://hemingwang.blogspot.com/2019/10/resnext.html

DenseNet
https://hemingwang.blogspot.com/2019/11/densenet.html

DPN
https://hemingwang.blogspot.com/2019/11/dpn.html

DLA
https://hemingwang.blogspot.com/2019/11/dla.html

Res2Net
https://hemingwang.blogspot.com/2019/11/res2net.html

Inception v3
https://hemingwang.blogspot.com/2019/11/inception-v3.html

Inception v4
https://hemingwang.blogspot.com/2019/11/inception-v4.html

PolyNet
https://hemingwang.blogspot.com/2019/11/polynet.html

----- 

◎ Mobile

-----

Mobile
https://hemingwang.blogspot.com/2019/10/mobile.html

SqueezeNet
https://hemingwang.blogspot.com/2019/10/squeezenet.html

MobileNet v1
https://hemingwang.blogspot.com/2019/10/mobilenet-v1.html

ShuffleNet

Xception
https://hemingwang.blogspot.com/2019/10/xception.html

-----

◎ Semantic Segmentation

-----

Semantic Segmentation
https://hemingwang.blogspot.com/2019/01/semantic-segmentation.html

FCN
http://hemingwang.blogspot.com/2018/02/deep-learningfcn.html
https://hemingwang.blogspot.com/2019/11/fcn.html

DeconvNet
https://hemingwang.blogspot.com/2019/11/deconvnet.html 

SegNet
https://hemingwang.blogspot.com/2019/11/segnet.html

U-Net
https://hemingwang.blogspot.com/2019/10/u-net.html

U-Net++
https://hemingwang.blogspot.com/2019/11/u-net.html

DilatedNet
https://hemingwang.blogspot.com/2019/11/dilatednet.html

ENet
https://hemingwang.blogspot.com/2019/11/enet.html

DRN
https://hemingwang.blogspot.com/2019/11/drn.html

FC-CRF
https://hemingwang.blogspot.com/2019/11/fc-crf.html

DeepLab
https://hemingwang.blogspot.com/2019/10/deeplab.html

DeepLab v1
https://hemingwang.blogspot.com/2019/11/deeplab-v1.html

DeepLab v2
https://hemingwang.blogspot.com/2019/11/deeplab-v2.html

DeepLab v3
https://hemingwang.blogspot.com/2019/11/deeplab-v3.html

DeepLab v3+
https://hemingwang.blogspot.com/2019/11/deeplab-v3-plus.html

ResNet-38
https://hemingwang.blogspot.com/2019/11/resnet-38.html

RefineNet
https://hemingwang.blogspot.com/2019/11/refinenet.html

RefineNet-LW
https://hemingwang.blogspot.com/2019/11/refinenet-lw.html

RefineNet-AA
https://hemingwang.blogspot.com/2019/11/refinenet-aa.html

PSPNet
https://hemingwang.blogspot.com/2019/10/pspnet.html

ICNet
https://hemingwang.blogspot.com/2019/11/icnet.html

BiSeNet
https://hemingwang.blogspot.com/2019/11/bisenet.html

Fast-SCNN
https://hemingwang.blogspot.com/2019/11/fast-scnn.html

BlitzNet
https://hemingwang.blogspot.com/2019/11/blitznet.html

----- 

◎ Object Detection

-----
 
Object Detection
https://hemingwang.blogspot.com/2019/10/object-detection.html

DPM
https://hemingwang.blogspot.com/2019/11/dpm.html

SS
https://hemingwang.blogspot.com/2019/11/ss.html

R-CNN
https://hemingwang.blogspot.com/2019/11/r-cnn.html

SPPNet
https://hemingwang.blogspot.com/2019/11/sppnet.html

Fast R-CNN
https://hemingwang.blogspot.com/2019/11/fast-r-cnn.html

Faster R-CNN
https://hemingwang.blogspot.com/2019/09/faster-r-cnn.html 

OverFeat
https://hemingwang.blogspot.com/2019/11/overfeat.html

YOLOv1
http://hemingwang.blogspot.com/2018/04/deep-learningyolo-v1.html
http://hemingwang.blogspot.com/2018/04/machine-learning-conceptmean-average.html
http://hemingwang.blogspot.com/2018/04/machine-learning-conceptnon-maximum.html
https://hemingwang.blogspot.com/2019/11/yolo-v1.html

SSD
https://hemingwang.blogspot.com/2019/09/ssd.html 

DSSD
https://hemingwang.blogspot.com/2019/11/dssd.html 

YOLOv2
https://hemingwang.blogspot.com/2019/11/yolo-v2.html

ION
https://hemingwang.blogspot.com/2019/11/ion.html

R-FCN
https://hemingwang.blogspot.com/2019/11/r-fcn.html

SATO
https://hemingwang.blogspot.com/2019/10/sato.html 

DCNv1
https://hemingwang.blogspot.com/2019/12/dcn-v1.html

DCNv2
https://hemingwang.blogspot.com/2019/12/dcn-v2.html

Cascade R-CNN
https://hemingwang.blogspot.com/2019/12/cascade-r-cnn.html

FPN
https://hemingwang.blogspot.com/2019/11/fpn.html

STDN
https://hemingwang.blogspot.com/2019/12/stdn.html

YOLOv3
https://hemingwang.blogspot.com/2019/11/yolo-v3.html

RON
https://hemingwang.blogspot.com/2019/12/ron.html

RefineDet
https://hemingwang.blogspot.com/2019/11/refinedet.html

M2Det
https://hemingwang.blogspot.com/2019/10/m2det.html

SNIP
https://hemingwang.blogspot.com/2019/12/snip.html 

SNIPER
https://hemingwang.blogspot.com/2019/12/sniper.html

AutoFocus
https://hemingwang.blogspot.com/2019/12/autofocus.html

DetNet
https://hemingwang.blogspot.com/2019/12/detnet.html

TridentNet
https://hemingwang.blogspot.com/2019/12/tridentnet.html

OHEM
https://hemingwang.blogspot.com/2019/11/ohem.html

Focal Loss
https://hemingwang.blogspot.com/2019/10/retinanet.html

GHM
https://hemingwang.blogspot.com/2019/12/ghm.html

Libra R-CNN
https://hemingwang.blogspot.com/2019/12/libra-r-cnn.html

DCRv1
https://hemingwang.blogspot.com/2019/12/dcr-v1.html

DCRv2
https://hemingwang.blogspot.com/2019/12/dcr-v2.html

PISA
https://hemingwang.blogspot.com/2019/12/pisa.html

-----

◎ Instance Segmentation

-----

Hypercolumn

MNC

DeepMask

SharpMask

MultiPathNet

InstanceFCN

FCIS
https://hemingwang.blogspot.com/2019/10/fcis.html

Mask R-CNN
https://hemingwang.blogspot.com/2019/10/mask-r-cnn.html

MaskX R-CNN


MaskLab

PANet

HTC

RetinaMask

MS R-CNN

YOLACT

-----

◎ Face Recognition

-----

DeepFace

DeepID

MobileID

VGGFace

FaceNet

MobileFace

Center Loss

SphereFace(A-softmax)

CosFace(AM-softmax)

ArcFace

OpenFace

SeetaFace

----- 

◎ Visual Tracking

-----



-----

◎ Dataset

-----

Dataset
https://hemingwang.blogspot.com/2019/10/dataset.html

CALTECH
CIFAR-10
PASCAL VOC
COCO
MNIST
ILSVRC 14
Cityscapes

-----
 
Part II:Natural Language Processing
 
-----

◎ LSTM

-----

http://hemingwang.blogspot.com/2019/09/understanding-lstm-networks.html
https://hemingwang.blogspot.com/2019/09/lstm.html

-----

◎ NNLM

-----



-----

◎ Word2vec 

-----




-----

◎ Seq2seq 

-----

http://hemingwang.blogspot.com/2019/10/word-level-english-to-marathi-neural.html
https://hemingwang.blogspot.com/2019/09/seq2seq.html


-----

◎ Attention

-----

http://hemingwang.blogspot.com/2019/10/attention-in-nlp.html
http://hemingwang.blogspot.com/2019/01/attention.html

-----

◎ ConvS2S 

-----

http://hemingwang.blogspot.com/2019/10/understanding-incremental-decoding-in.html 
https://hemingwang.blogspot.com/2019/04/convs2s.html

-----

◎ Transformer

-----

http://hemingwang.blogspot.com/2019/10/the-illustrated-transformer.html 
http://hemingwang.blogspot.com/2019/01/transformer.html 

-----

Part III:Fundamental Topics

-----

◎ Optimization

-----

Optimization
http://hemingwang.blogspot.com/2019/10/an-overview-of-gradient-descent.html
https://hemingwang.blogspot.com/2019/01/optimization.html

SGD
http://hemingwang.blogspot.com/2019/12/sgd.html 

Momentum
https://hemingwang.blogspot.com/2019/12/momentum.html 

NAG
https://hemingwang.blogspot.com/2019/12/nag.html

AdaGrad
https://hemingwang.blogspot.com/2019/12/adagrad.html

AdaDelta
https://hemingwang.blogspot.com/2019/12/adadelta.html

RMSProp
https://hemingwang.blogspot.com/2019/12/rmsprop.html

Adam
https://hemingwang.blogspot.com/2019/12/adam.html

Nadam
https://hemingwang.blogspot.com/2019/12/nadam.html

AMSGrad
https://hemingwang.blogspot.com/2019/12/amsgrad.html

CLR
https://hemingwang.blogspot.com/2019/12/clr.html

SGDR
https://hemingwang.blogspot.com/2019/12/sgdr.html

AdamW
https://hemingwang.blogspot.com/2019/12/adamw.html

Super-Convergence
https://hemingwang.blogspot.com/2019/12/super-convergence.html

Lookahead
https://hemingwang.blogspot.com/2019/12/lookahead.html

RAdam
https://hemingwang.blogspot.com/2019/12/radam.html

ADMM
https://hemingwang.blogspot.com/2019/12/admm.html

ADMM-S
https://hemingwang.blogspot.com/2019/12/admm-s.html

dlADMM
https://hemingwang.blogspot.com/2019/12/dladmm.html

-----

◎ Regularization

-----

Regularization
https://hemingwang.blogspot.com/2019/10/an-overview-of-regularization.html
https://hemingwang.blogspot.com/2019/10/regularization.html

Weight Decay

Dropout

-----

◎ Normalization

-----

Normalization
http://hemingwang.blogspot.com/2019/10/an-overview-of-normalization-methods-in.html
https://hemingwang.blogspot.com/2019/10/normalization.html

BN
WN
LN
IN
GN
PN

-----

◎ Loss Function

-----

Loss Function
https://hemingwang.blogspot.com/2019/10/a-brief-overview-of-loss-functions-in.html
http://hemingwang.blogspot.com/2019/05/loss-function.html

-----

◎ Activation Function

-----

Activation Function
https://hemingwang.blogspot.com/2019/10/understanding-activation-functions-in.html

-----

◎ Convolution

-----
 
Convolution
https://hemingwang.blogspot.com/2019/11/convolution.html

-----

◎ Auto Differentiation

-----

Automatic Differentiation

-----

◎ Back Propagation

-----

Back Propagation

-----

◎ Computational Graph

-----

Computational Graph


-----

◎ Medicine

-----