Fig. 1.1. AlexNet [1].

Fig. 1.2. Modification of AlexNet [2].

Fig. 1.3. LeNet [3].

Fig. 1.4. Architectural changes of AlexNet [2].

-----

Q2: GPU

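AlexNet led the way in training on GPUs, and it paid off handsomely. The network was split across two GPUs. Curiously, the C1 filters for lines and the C1 filters for colors end up on different GPUs on every run [7], which is worth thinking about.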

-----

Q3: Augmentation

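Data augmentation helps reduce overfitting, which in turn improves accuracy at test time. Because the transformations run on the CPU rather than the GPU (while the GPU trains on the previous batch), they add essentially nothing to the overall computation time.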

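The first method is horizontal reflections, that is, flipping the image left to right.
The second method is altering the intensities of the RGB channels, using PCA to shift the colors slightly. The paper does not cite a reference for this technique. Its effect is not small (about 1% on the Top-1 error rate), but we only touch on it briefly here.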
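Both methods can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the code from [1]: the function names are made up for this post, the images are assumed to be float arrays of shape (H, W, 3), and the standard deviation of 0.1 for the random coefficients follows the description in [1].

```python
import numpy as np

def horizontal_flip(image):
    """Mirror an (H, W, 3) image left to right."""
    return image[:, ::-1, :]

def rgb_pca(images):
    """PCA over all RGB pixel values of a training set shaped (N, H, W, 3)."""
    pixels = images.reshape(-1, 3).astype(np.float64)
    cov = np.cov(pixels, rowvar=False)       # 3 x 3 covariance of R, G, B
    eigvals, eigvecs = np.linalg.eigh(cov)   # columns of eigvecs are the p_i
    return eigvals, eigvecs

def pca_color_jitter(image, eigvals, eigvecs, rng, sigma=0.1):
    """Add the same random RGB shift to every pixel of one image.

    shift = [p1, p2, p3] @ [a1 * l1, a2 * l2, a3 * l3], with a_i ~ N(0, sigma).
    """
    alphas = rng.normal(0.0, sigma, size=3)
    shift = eigvecs @ (alphas * eigvals)
    return image + shift                     # broadcasts over H and W

# Usage on random stand-in data (hypothetical, not ImageNet).
rng = np.random.default_rng(0)
train_images = rng.random((4, 8, 8, 3))
eigvals, eigvecs = rgb_pca(train_images)
augmented = pca_color_jitter(horizontal_flip(train_images[0]), eigvals, eigvecs, rng)
```

The coefficients are drawn once per image, so the whole image shifts in color together rather than pixel by pixel.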

-----

Q4: LRN

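Local response normalization (LRN).

This one deserves a closer look.

The formula in Fig. 2.1 is somewhat involved, so it helps to look first at where it came from, starting with the earliest version in Fig. 2.3. That is a V1-like model, meaning it mimics the response of the first stage of the visual cortex. The core idea is to subtract the mean and then normalize. Fig. 2.3 and Fig. 2.2 both normalize contrast. Fig. 2.1 does not subtract the mean, so the authors of [1] describe it as normalizing brightness instead. The parameters in the formula were tuned by the authors.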
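To make the formula in Fig. 2.1 more concrete, here is a rough NumPy sketch of it. The function name and the (channels, H, W) array layout are assumptions made for this post. The default values k = 2, n = 5, alpha = 1e-4 and beta = 0.75 are the ones reported in [1].

```python
import numpy as np

def local_response_norm(a, k=2.0, n=5, alpha=1e-4, beta=0.75):
    """Normalize each activation by its neighbors across channels.

    a has shape (channels, H, W). For channel i, the denominator sums the
    squared activations of up to n adjacent channels at the same (x, y).
    """
    num_channels = a.shape[0]
    squared = a.astype(np.float64) ** 2
    b = np.empty(a.shape, dtype=np.float64)
    half = n // 2
    for i in range(num_channels):
        lo = max(0, i - half)
        hi = min(num_channels - 1, i + half)
        denom = (k + alpha * squared[lo:hi + 1].sum(axis=0)) ** beta
        b[i] = a[i] / denom
    return b
```

Note that nothing is subtracted before dividing, which is the difference from the contrast normalization in Fig. 2.2 and Fig. 2.3.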

Fig. 2.1. Local response normalization, p. 4 [1].

Fig. 2.2. Local contrast normalization, p. 3 [11].

Fig. 2.3. Local input divisive normalization, p. 5 [12].

-----

Q5: ReLU

ReLU is an activation function that has become widely used in recent years. See [13] for details.
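For reference, ReLU simply computes f(x) = max(0, x). A minimal NumPy sketch (the function name is just for illustration):

```python
import numpy as np

def relu(x):
    """Rectified linear unit: pass positive values, clamp negatives to zero."""
    return np.maximum(x, 0.0)
```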

-----

Q6: Pooling

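Pooling regions normally do not overlap [14]. This paper uses overlapping pooling instead, which lowers the Top-1 error rate by 0.4% and the Top-5 error rate by 0.3%.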
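The difference between the two schemes comes down to the window size z and the stride s. The sketch below is a plain, unoptimized max pooling over a single 2-D feature map, written for this post. The settings z = 3, s = 2 (overlapping) and z = 2, s = 2 (non-overlapping) are the ones compared in [1].

```python
import numpy as np

def max_pool2d(x, size, stride):
    """Max pooling over a 2-D array with a square window.

    Windows overlap whenever stride < size.
    """
    h, w = x.shape
    out_h = (h - size) // stride + 1
    out_w = (w - size) // stride + 1
    out = np.empty((out_h, out_w), dtype=x.dtype)
    for r in range(out_h):
        for c in range(out_w):
            top, left = r * stride, c * stride
            out[r, c] = x[top:top + size, left:left + size].max()
    return out

feature_map = np.arange(25).reshape(5, 5)
overlapping = max_pool2d(feature_map, size=3, stride=2)      # z = 3, s = 2
non_overlapping = max_pool2d(feature_map, size=2, stride=2)  # z = 2, s = 2
```

On this 5 x 5 input both settings produce a 2 x 2 output, but with z = 3 neighboring windows share a row or column of inputs.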

-----

Q7: Dropout

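That brings us to dropout. Dropout is a very simple idea, and it is applied to the fully connected layers.

As Fig. 3.1 shows, some of the nodes in each layer are dropped during training. The dropped nodes are chosen at random each time, so every training step effectively trains a different network, which is an effective guard against overfitting.

Fig. 3.2 says that if a node is present with probability p during training, then its weights w must be multiplied by p at test time, because at test time every node is kept.

Fig. 3.3 gives the formulas, but they are easier to follow after looking at Fig. 3.5.

The left side of Fig. 3.5 is a conventional neural network. Suppose we first draw a random value for each node that is 1 with probability p and 0 with probability (1-p), multiply the node's output by that value, and then apply the weights and sum. Nodes multiplied by 1 are kept and nodes multiplied by 0 are dropped. That explains both the formulas in Fig. 3.3 and the Bernoulli distribution in Fig. 3.4.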
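The training-time mask and the test-time scaling described above can be sketched as follows. This is a simplified illustration for a single linear layer with no activation function. The function names are invented for this post, and p is the probability of keeping a node, as in [15].

```python
import numpy as np

def dropout_mask(y, p, rng):
    """Draw r ~ Bernoulli(p) per node and return the thinned output r * y."""
    r = rng.binomial(1, p, size=y.shape)
    return r * y, r

def layer_forward(y, W, b, p, rng=None, training=True):
    """Pre-activation z of one fully connected layer fed by dropout.

    Training: z = W @ (r * y) + b, with a fresh random mask r each call.
    Testing:  z = (p * W) @ y + b, all nodes kept and weights scaled by p.
    """
    if training:
        y_thinned, _ = dropout_mask(y, p, rng)
        return W @ y_thinned + b
    return (p * W) @ y + b

rng = np.random.default_rng(0)
y = np.ones(4)
W = np.ones((1, 4))
b = np.zeros(1)
z_train = layer_forward(y, W, b, p=0.5, rng=rng, training=True)
z_test = layer_forward(y, W, b, p=0.5, training=False)
```

Scaling the weights by p makes the test-time output match the average of the training-time outputs over many random masks.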

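[16] gives an in-depth treatment, while [17], [18] are simpler explanations. Dropconnect [17] is a special case of dropout [16]. Although [17] claims that it is a generalization, I side with the argument in [16].

Fig. 3.6 compares several activation functions combined with drop. Interestingly, tanh actually gets worse once it is combined with dropout!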

Fig. 3.1. Dropout neural net model, p. 1930 [15].

Fig. 3.2. A unit at training time and at test time, p. 1931 [15].

Fig. 3.3. Formulae of a standard and dropout network, p. 1933 [15].

Fig. 3.4. Bernoulli distribution, p. 62 [16].

Fig. 3.5. Standard and dropout network, p. 1934 [15].

Fig. 3.6. Drop and activation function, p. 5 [20].

-----

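Conclusion:

The networks that followed AlexNet, such as ZFNet and VGGNet, all inherited its architecture. Even GoogLeNet, which took a different route, adopted its techniques such as dropout [8]. That shows how successful AlexNet was!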

-----

References

[1] 2012_Imagenet classification with deep convolutional neural networks

[2] 2014_Visualizing and understanding convolutional networks

[3] 1998_Gradient-Based Learning Applied to Document Recognition

[4] 2016_Mastering the game of Go with deep neural networks and tree search

[5] AI從頭學（一二）：LeNet
http://hemingwang.blogspot.tw/2017/03/ailenet.html

[6] AI從頭學（一三）：LeNet - F6
http://hemingwang.blogspot.tw/2017/03/ailenet-f6.html

[7] AI從頭學（二五）：Kernel Visualizing
http://hemingwang.blogspot.tw/2017/05/aikernel-visualizing.html

[8] AI從頭學（一八）：Convolutional Neural Network
http://hemingwang.blogspot.tw/2017/03/aiconvolutional-neural-network_23.html

[9] AI從頭學（一一）：A Glance at Deep Learning
http://hemingwang.blogspot.tw/2017/02/aia-glance-at-deep-learning.html

[10] AI從頭學（二六）：Aja Huang
http://hemingwang.blogspot.tw/2017/05/aiaja-huang.html

[11] 2009_What is the best multi-stage architecture for object recognition

[12] 2008_Why is real-world visual object recognition hard

[13] mAiLab_0005：Activation Function
http://hemingwang.blogspot.tw/2017/05/mailab0005activation-function.html

[14] mAiLab_0006：Pooling
http://hemingwang.blogspot.tw/2017/05/mailab0006pooling.html

[15] 2014_Dropout, a simple way to prevent neural networks from overfitting

[16] 2016_Deep Learning

[17] 2013_Regularization of neural networks using dropconnect

[18] 2013_Maxout networks

The Star Also Rises

Wednesday, May 31, 2017

AI 從頭學（二六）：AlexNet
