The Star Also Rises: November 2021

Thursday, November 25, 2021

Kaohsiung Mini Trips (20): Sizihwan

2021/11/25

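After Lidong (Start of Winter) comes Xiaoxue (Minor Snow).

It rained a little last night.

I woke up very early, finished my sitting meditation, and had breakfast. I decided to go out, and on the spur of the moment picked Sizihwan.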

-----

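East, south, north, west.

To the east, I suppose that means Chaozhou.

To the south, as far as Donggang.

To the north, almost to Shu-Te University, though I still wrote that one up as Chengcing Lake.

To the west, this time it is of course Sizihwan.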

-----

Cutting through the city is not a very good route.

With everyone heading to work and school, the air was murky.

-----

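Sizihwan could just as well be called National Sun Yat-sen University. Riding inside the campus was quite pleasant.

I rode a short stretch in the direction of Chaishan.

I even saw a beauty rat snake (錦蛇).

I asked a passerby who was explaining the snake to people. The road on toward Chaishan is closed; it is a military restricted area.

I rode back more or less the way I came. Seven to nine o'clock, two hours in all.

I had not really planned to go out, but work had reached a milestone, and it is better to keep up the habit of getting out to exercise.

Sure enough, it felt good.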

-----

Kaohsiung Mini Trips

2021/10/07

-----

2021/03/27

-----

Kaohsiung Mini Trips (20): Sizihwan

2021/11/25 Thu

https://hemingwang.blogspot.com/2021/11/blog-post_25.html

Kaohsiung Mini Trips (19): Ziyun Temple

2021/11/18 Thu

https://hemingwang.blogspot.com/2021/11/blog-post_18.html

Kaohsiung Mini Trips (18): Chengcing Lake

2021/11/12 Fri

https://hemingwang.blogspot.com/2021/11/blog-post_29.html

Kaohsiung Mini Trips (17): Changzhi

2021/11/06 Sat

https://hemingwang.blogspot.com/2021/11/blog-post_51.html

Kaohsiung Mini Trips (16): Dashu

2021/11/04 Thu

https://hemingwang.blogspot.com/2021/11/blog-post_5.html

Kaohsiung Mini Trips (15): Donggang

2021/10/28 Thu

http://hemingwang.blogspot.com/2021/10/blog-post_4.html

Kaohsiung Mini Trips (14): Fengyi Misua (鳳邑麵線)

2021/10/23 Sat

https://hemingwang.blogspot.com/2021/10/blog-post_23.html

Kaohsiung Mini Trips (13): Daliao - North - Kaohsiung Route 71 - Daliao MRT Station

2021/10/21 Thu

https://hemingwang.blogspot.com/2021/10/71.html

Kaohsiung Mini Trips (12): RT-Mart Fengshan

2021/10/16 Sat

https://hemingwang.blogspot.com/2021/10/blog-post_93.html

Kaohsiung Mini Trips (11): Daliao - South - Guangming Road

2021/10/15 Fri

https://hemingwang.blogspot.com/2021/10/blog-post_8.html

Kaohsiung Mini Trips (10): Daliao - North - Guangming Road

2021/10/06 Wed

https://hemingwang.blogspot.com/2021/10/blog-post_6.html

Kaohsiung Mini Trips (9): Chaozhou - Laiyi

2021/10/02 Sat

https://hemingwang.blogspot.com/2021/10/blog-post.html

Kaohsiung Mini Trips (8): Daliao - South - Provincial Highway 25

2021/09/29 Wed

https://hemingwang.blogspot.com/2021/09/blog-post_27.html

Kaohsiung Mini Trips (7): Daliao - North - Provincial Highway 25

2021/09/24 Fri

https://hemingwang.blogspot.com/2021/09/25.html

Kaohsiung Mini Trips (6): Wandan - Zhutian

2021/09/20 Mon

https://hemingwang.blogspot.com/2021/09/blog-post_9.html

Kaohsiung Mini Trips (5): Wandan Red Bean Cakes (Huang)

2021/09/18 Sat

https://hemingwang.blogspot.com/2021/09/blog-post_18.html

Kaohsiung Mini Trips (4): Wandan - Shepi

2021/09/17 Fri

https://hemingwang.blogspot.com/2021/09/blog-post_17.html

Kaohsiung Mini Trips (3): Wandan

2021/09/08 Wed

https://hemingwang.blogspot.com/2021/09/blog-post_11.html

Kaohsiung Mini Trips (2): Liudu Vegetarian

2021/09/04 Sat

https://hemingwang.blogspot.com/2021/09/blog-post.html

Kaohsiung Mini Trips (1): Expressway 88 - Gaoping River

2021/08/29 Sun

https://hemingwang.blogspot.com/2021/08/88.html

-----

Sunday, November 21, 2021

Xi Bao (喜寶)

2021/11/21

Xi Bao is Yi Shu's best-known work, ranked 91st on the Top 100 of the Century list (世紀百強).

-----

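Because it is on the Top 100 list, is by a Hong Kong writer, and the author is Ni Kuang's younger sister, I had been curious to read it. I downloaded it from Haodu (好讀), read a little, felt it was not for me, and stopped.

To my surprise, last night I read it through in one sitting.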

-----

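After moving back to my family home, I found "Lan Niao Ji" (藍鳥記) and "San Fa" (散髮). Both are short-story collections, brisk and fluent. I have no idea where these two books came from, though.

Some time later, I read four of her full-length novels on Haodu, one after another.

Zi Wei Yuan (紫微願).

The First Half of My Life (我的前半生).

Zhao Hua Xi Shi (朝花夕拾).

Xi Bao (喜寶).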

-----

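"Zi Wei Yuan" is about a career woman who, for the sake of love, wishes she were younger. She happens to meet aliens. Once she is younger she finds it is not as useful as she hoped, and asks the aliens to change her back. Aliens or not, it is still a love story, with a touch of life philosophy.

"The First Half of My Life": a middle-aged doctor's wife is divorced by her husband. She then fights her way through working life and finds a good partner again. It is a fairy tale, somewhat realistic yet not quite true to life.

"Zhao Hua Xi Shi": the protagonist accidentally travels through time and ends up looking after her mother. There is also a love that cannot be had. Wisely (衛斯理) makes an appearance.

Finally, "Xi Bao". A Cambridge undergraduate who sells her soul, or her body, ends up falling in love with the tycoon who bought her, ends up ruined, and in the end is not really ruined.

It has a Faustian flavor.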

-----

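Love, money, health: which matters most?

Or: is it worth giving up love for money?

Or: is it worth giving up one's life for money?

A bitter theme, yet in the end it is written a little like a comedy.

Had Xi Bao not sold herself to the tycoon, how would she have found the tuition to keep studying? That, I think, is another story.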

-----

ELMo (4): Appendix

2021/11/18

-----

https://pixabay.com/zh/photos/child-girl-face-bath-wash-foam-645451/

Modified from # BERT.

# ELMo

-----

# ELMo

-----

NER Deep Learning

# NER

-----

References

# ELMo. Cited 5229 times. Of the Context2vec-style models, ELMo performs best.

Peters, Matthew E., et al. "Deep contextualized word representations." arXiv preprint arXiv:1802.05365 (2018).

https://arxiv.org/pdf/1802.05365.pdf

# BERT. Cited 12556 times.

Devlin, Jacob, et al. "Bert: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).

https://arxiv.org/pdf/1810.04805.pdf

# NER

Yadav, Vikas, and Steven Bethard. "A survey on recent advances in named entity recognition from deep learning models." arXiv preprint arXiv:1910.11470 (2019).

https://arxiv.org/pdf/1910.11470.pdf

-----

ELMo (3): Illustrated

2021/09/01

-----

https://pixabay.com/zh/photos/beard-the-old-man-turban-india-2268096/

-----

# Outline

-----

# Tasks

-----

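Notes:

POS tagging, chunking, NER, and SRL can all be handled by feeding word embeddings into an LSTM and training it with supervision.

BIO and BIOES.

"B stands for Begin and marks the start. I stands for Intermediate and marks the middle. E stands for End and marks the end. S stands for Single and marks a single character. O stands for Other and marks irrelevant characters."

"Tagging the sentence 「小明在北京大學的燕園看了中國男籃的一場比賽」 (Xiao Ming watched a game of the Chinese men's basketball team at Yan Yuan, Peking University), one tag per character, gives:

[B-PER, E-PER, O, B-ORG, I-ORG, I-ORG, E-ORG, O, B-LOC, E-LOC, O, O, B-ORG, I-ORG, I-ORG, E-ORG, O, O, O, O, O]"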

https://zhuanlan.zhihu.com/p/88544122
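A minimal sketch of how BIOES tags can be produced from entity spans, assuming character-level tokens as in the example sentence. The function name `bioes_tags` and the `(start, end, label)` span format with an exclusive end are hypothetical, chosen only for illustration.

```python
def bioes_tags(length, spans):
    """Return one BIOES tag per token.

    spans is a list of (start, end, label) tuples, end exclusive.
    This span format is an assumption made for this sketch.
    """
    tags = ["O"] * length
    for start, end, label in spans:
        if end - start == 1:
            # A one-token entity gets the Single tag.
            tags[start] = f"S-{label}"
            continue
        tags[start] = f"B-{label}"
        for i in range(start + 1, end - 1):
            tags[i] = f"I-{label}"
        tags[end - 1] = f"E-{label}"
    return tags


sentence = "小明在北京大學的燕園看了中國男籃的一場比賽"
spans = [(0, 2, "PER"), (3, 7, "ORG"), (8, 10, "LOC"), (12, 16, "ORG")]
tags = bioes_tags(len(sentence), spans)
for char, tag in zip(sentence, tags):
    print(char, tag)
```

Dropping the E and S tags (using I for the last token and B for a single token) turns the same spans into the plainer BIO scheme.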

-----

SRL

-----

Coref

Coreference resolution (also called reference resolution)

Finding the real-world entity that pronouns and related words refer to. For example, he, his, and Obama.

https://zhuanlan.zhihu.com/p/53550123

-----

SNLI

Stanford Natural Language Inference

entailment: implication, inference.

contradiction: conflict, opposition.

neutral: neutral, unrelated.

https://blog.eson.org/pub/a1c27ad7/

-----

SQuAD

https://zhuanlan.zhihu.com/p/137828922

-----

SST-5

「The Stanford Sentiment Treebank is a corpus with fully labeled parse trees that allows for a complete analysis of the compositional effects of sentiment in language. The corpus is based on the dataset introduced by Pang and Lee (2005) and consists of 11,855 single sentences extracted from movie reviews. It was parsed with the Stanford parser and includes a total of 215,154 unique phrases from those parse trees, each annotated by 3 human judges.」

「Each phrase is labelled as either negative, somewhat negative, neutral, somewhat positive or positive. The corpus with all 5 labels is referred to as SST-5 or SST fine-grained. Binary classification experiments on full sentences (negative or somewhat negative vs somewhat positive or positive with neutral sentences discarded) refer to the dataset as SST-2 or SST binary.」

https://paperswithcode.com/dataset/sst

https://blog.csdn.net/xxr233/article/details/115456578
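A small sketch of the relationship between the five SST-5 labels and the binary SST-2 setting described in the quote: the two negative labels collapse into one class, the two positive labels into another, and neutral items are discarded. The function name `sst5_to_sst2` and the use of `None` for discarded items are assumptions for illustration, not part of any dataset loader.

```python
SST5_LABELS = [
    "negative",
    "somewhat negative",
    "neutral",
    "somewhat positive",
    "positive",
]


def sst5_to_sst2(label):
    """Map a fine-grained label to a binary one; None means 'discard'."""
    if label in ("negative", "somewhat negative"):
        return "negative"
    if label in ("somewhat positive", "positive"):
        return "positive"
    if label == "neutral":
        return None
    raise ValueError(f"unknown SST-5 label: {label!r}")


binary = [sst5_to_sst2(label) for label in SST5_LABELS]
kept = [b for b in binary if b is not None]
print(binary)
print(kept)
```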

-----

Modified from # BERT.

Notes:

The pretrained model outputs context vectors from three layers.

-----

Figure 2: Visualization of softmax normalized biLM layer weights across tasks and ELMo locations. Normalized weights less than 1/3 are hatched with horizontal lines and those greater than 2/3 are speckled.

# ELMo

Notes:

"Visualization of learned weights

Figure 2 visualizes the softmax-normalized learned layer weights. At the input layer, the task model favors the first biLSTM layer. For coreference and SQuAD, this is strongly favored, but the distribution is less peaked for the other tasks. The output layer weights are relatively balanced, with a slight preference for the lower layers."

https://www.groundai.com/project/deep-contextualized-word-representations/1
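A minimal sketch of what "softmax-normalized layer weights" means for one token: the task model learns one raw weight per biLM layer, normalizes them with a softmax, and uses them to mix the layer vectors, scaled by a task-specific factor. The function names, the toy two-dimensional vectors, and the `gamma` default are assumptions for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def elmo_mix(layers, raw_weights, gamma=1.0):
    """Mix per-layer vectors for one token.

    layers: one vector per biLM layer, all of the same dimension.
    raw_weights: one unnormalized scalar per layer (learned in practice).
    """
    s = softmax(raw_weights)
    dim = len(layers[0])
    return [
        gamma * sum(s[j] * layers[j][d] for j in range(len(layers)))
        for d in range(dim)
    ]


# Toy values: three layers, two dimensions.
layers = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mixed = elmo_mix(layers, [0.0, 0.0, 0.0])  # equal raw weights -> 1/3 each
print(mixed)
```

A normalized weight above 2/3 for one layer, as in the speckled cells of Figure 2, means the mixed vector is dominated by that layer.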

-----

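Notes:

"With GloVe, for a polysemous word such as play, the nearest neighbors found from its embedding are mostly sports-related, evidently because sentences containing play in the training data come predominantly from the sports domain."

"With ELMo, the embedding is adjusted dynamically according to context, so it can not only retrieve sentences in which play carries the same 'performance' sense, but also ensure that play has the same part of speech in the retrieved sentence. This works because the first LSTM layer encodes a lot of syntactic information."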

https://blog.csdn.net/qq_35883464/article/details/100173045

-----

References

# ELMo. Cited 5229 times. Of the Context2vec-style models, ELMo performs best.

Peters, Matthew E., et al. "Deep contextualized word representations." arXiv preprint arXiv:1802.05365 (2018).

https://arxiv.org/pdf/1802.05365.pdf

# BERT. Cited 12556 times.

Devlin, Jacob, et al. "Bert: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).

https://arxiv.org/pdf/1810.04805.pdf

# NER

Yadav, Vikas, and Steven Bethard. "A survey on recent advances in named entity recognition from deep learning models." arXiv preprint arXiv:1910.11470 (2019).

https://arxiv.org/pdf/1910.11470.pdf

# SRL

Tan, Zhixing, et al. "Deep semantic role labeling with self-attention." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 32. No. 1. 2018.

https://ojs.aaai.org/index.php/AAAI/article/download/11928/11787

# Coref

Lee, Kenton, et al. "End-to-end neural coreference resolution." arXiv preprint arXiv:1707.07045 (2017).

https://arxiv.org/pdf/1707.07045.pdf

# SNLI

Camburu, Oana-Maria, et al. "e-snli: Natural language inference with natural language explanations." arXiv preprint arXiv:1812.01193 (2018).

https://arxiv.org/pdf/1812.01193.pdf

# SQuAD

Rajpurkar, Pranav, Robin Jia, and Percy Liang. "Know what you don't know: Unanswerable questions for SQuAD." arXiv preprint arXiv:1806.03822 (2018).

https://arxiv.org/pdf/1806.03822.pdf

-----

The Star Also Rises: ELMo

https://hemingwang.blogspot.com/2019/04/elmo.html

-----

ELMo (2): Overview

2020/12/28

-----

https://pixabay.com/zh/photos/elmo-pirate-toy-kids-sailor-2078481/

-----

◎ Abstract

-----

◎ Introduction

-----

Which problems (weaknesses) of earlier work does this paper set out to solve?

-----

# Paragraph2vec.

Notes:

Sentence vectors.

-----

◎ Method

-----

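The solution?

-----

Modified from # BERT.

Notes:

The pretrained model outputs context vectors from three layers.

-----

Specific details?

Notes:

The three layers are combined into one context-dependent vector (in the ELMo paper, a softmax-weighted sum of the layers scaled by a task-specific factor).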

-----

◎ Result

-----

Results of this paper.

-----

◎ Discussion

-----

Comparison of this paper with other papers (results or methods).

-----

Comparison of results.

-----

Comparison of methods.

-----

◎ Conclusion

-----

◎ Future Work

-----

Follow-up research in related areas.

-----

Follow-up research in extended areas.

-----

◎ References

-----

# Paragraph2vec. Cited 6763 times.

Le, Quoc, and Tomas Mikolov. "Distributed representations of sentences and documents." International conference on machine learning. 2014.

http://proceedings.mlr.press/v32/le14.pdf

# Context2vec. Cited 312 times.

Melamud, Oren, Jacob Goldberger, and Ido Dagan. "context2vec: Learning generic context embedding with bidirectional lstm." Proceedings of the 20th SIGNLL conference on computational natural language learning. 2016.

https://www.aclweb.org/anthology/K16-1006.pdf

CoVe

ELLM

# ELMo. Cited 5229 times. Of the Context2vec-style models, ELMo performs best.

Peters, Matthew E., et al. "Deep contextualized word representations." arXiv preprint arXiv:1802.05365 (2018).

https://arxiv.org/pdf/1802.05365.pdf

# BERT. Cited 12556 times.

Devlin, Jacob, et al. "Bert: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).

https://arxiv.org/pdf/1810.04805.pdf

-----

The Star Also Rises: ELMo

https://hemingwang.blogspot.com/2019/04/elmo.html

-----

ELMo

2019/04/08

-----

Fig. ELMo (image source).

-----

# BERT

-----

// Learn how to build powerful contextual word embeddings with ELMo

-----

// NLP自然语言处理：文本表示总结 - 下篇（ELMo、Transformer、GPT、BERT）_陈宸的博客-CSDN博客

-----

// NLP自然语言处理：文本表示总结 - 下篇（ELMo、Transformer、GPT、BERT）_陈宸的博客-CSDN博客

-----
References

ELMo Contextual Language Embedding
https://www.kdnuggets.com/2019/01/elmo-contextual-language-embedding.html

Learn how to build powerful contextual word embeddings with ELMo
https://medium.com/saarthi-ai/elmo-for-contextual-word-embedding-for-text-classification-24c9693b0045

What is ELMo | ELMo For text Classification in Python

https://www.analyticsvidhya.com/blog/2019/03/learn-to-use-elmo-to-extract-features-from-text/

// NLP自然语言处理：文本表示总结 - 下篇（ELMo、Transformer、GPT、BERT）_陈宸的博客-CSDN博客
https://blog.csdn.net/qq_35883464/article/details/100173045

Thursday, November 18, 2021

Kaohsiung Mini Trips (19): Ziyun Temple

2021/11/18

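Dharma Drum Mountain's Ziyun Temple is next to Chengcing Lake.

-----

Last time I circled the lake I should have passed Ziyun Temple, but I had not memorized the address.

This time I remembered: it is on Zhongxiao Road.

Yesterday I rode my scooter along Qingnian 1st Road; the braised pork rice shop is also on Zhongxiao Road, so the name stuck.

Following Qingnian Road, I still got lost as I neared Chengcing Lake, but I did find Ziyun Temple in the end.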

-----

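Shennong Road leads to Daliao and Guangming Road.

There is a fair amount of traffic, but it does not feel like an ordinary highway, perhaps because the roadside trees are tall and the road winds.

On the left are the hills of Niaosong.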

-----

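The stretch of Provincial Highway 1 between Guangming Road and Provincial Highway 25 is very busy.

This time I took the parallel back lanes instead.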

-----

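Nantai Spring Rolls (南台春捲), recommended by a friend.

I had seen the shop before, but later could never find it again. It turns out it is on the right-hand side, not the left.

No. 10, Wujia 1st Road.

The spring rolls and the fish thick soup (浮水魚羹) were both quite good.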

-----

I had not really wanted to go out this morning, but decided to anyway. The sun was out, and it was nice.

-----

Friday, November 12, 2021

Kaohsiung Mini Trips (18): Chengcing Lake

2021/11/12

-----

Circling the lake was nice.

Looking toward downtown Kaohsiung, you can see that the air is hazy.

-----

Thursday, November 11, 2021

Plants

2021/11/11

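Plants have tremendous vitality.

-----

The ground floor of my family home is a garage, and above the garage is a terrace. A number of plants grow on the terrace.

Evergreen (萬年青) is the mainstay; the one that resembles it is probably a money tree (金錢樹). There is a slender plant that puts out purple flowers, and it is not morning glory. Arrowhead vine (合果芋) and dumb cane (花葉萬年青) were easier to look up. Pothos was not hard. The more special one is bougainvillea.

The first bougainvillea was a gift from the lady next door. She said my balcony was all trees and no flowers, rather monotonous. Every townhouse at the end of the lane grows flowers, except mine. So she offered me a small pot of bougainvillea, and I declined. Some time later I had a change of heart and asked her for one.

At first I was very cautious: I could not find a good spot for it, and did not know how much to water it. After it flowered it began dropping leaves, and I feared it would not survive; later I learned this is normal. The pot was a division from their own plant, not store-bought, so I worried less about the embarrassment of killing it.

The second bougainvillea was one of our own. The old root had apparently died. During the plum rains, after days of alternating rain and sun, it unexpectedly put out tender shoots and grew quite well, and a while ago it even produced a small pink flower. The third bougainvillea is a water sprout cut from the first. I just stuck it in the soil and it took.

Besides the bougainvillea, I also planted two pothos, cut from the original vine. They climb higher and higher along the terrace railing, and the leaves keep getting bigger.

I also planted two bird figs (雀榕). Their vitality is extraordinary. One was cut from a branch by the front door, the other from the second-floor wall. I was going to throw them away, but they were already a fair size, so I found two pots and stuck them in; new leaves emerged before long.

The sad one is the mango tree. Of the mangoes I bought in summer, one sat a bit too long and was about to sprout by the time I ate it, so I planted the seed. Soon it put out two lovely leaves. But mango tolerates drought, not wet soil. I did not know that and watered it the same as the bougainvillea and pothos, and new leaves never came. Unsure whether the soil was too heavy, I quickly switched to a more free-draining soil and moved it to a bigger pot for future growth. But I could not save it, and it gradually withered. A pity.

Caring for plants is much simpler than caring for animals. The bare minimum is watering. Pruning is also necessary. Fertilizing is optional, unless you want the plant to grow better.

I have learned a lot from plants.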

-----

BN

2019/12/17

-----

// [ML筆記] Batch Normalization

-----

// An Intuitive Explanation of Why Batch Normalization Really Works (Normalization in Deep Learning Part 1) _ Machine Learning Explained

-----

References

# Batch Normalization

Ioffe, Sergey, and Christian Szegedy. "Batch normalization: Accelerating deep network training by reducing internal covariate shift." International conference on machine learning. 2015.

http://proceedings.mlr.press/v37/ioffe15.pdf

Notes for 'Batch Normalization Accelerating Deep Network Training by Reducing Internal Covariate Shift' paper · GitHub
https://gist.github.com/shagunsodhani/4441216a298df0fe6ab0

An Intuitive Explanation of Why Batch Normalization Really Works (Normalization in Deep Learning Part 1) _ Machine Learning Explained
https://mlexplained.com/2018/01/10/an-intuitive-explanation-of-why-batch-normalization-really-works-normalization-in-deep-learning-part-1/

Batch Normalization原理与实战 - 知乎
https://zhuanlan.zhihu.com/p/34879333

Batch Normalization导读 - 知乎
https://zhuanlan.zhihu.com/p/38176412

[ML筆記] Batch Normalization
http://violin-tao.blogspot.com/2018/02/ml-batch-normalization.html

Tuesday, November 09, 2021

Rats

2021/11/09

-----

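On Sunday I was lucky enough to catch a rat at home. After releasing it I posted about it on Facebook, and a friend commented that if the "nest" (窩, which can also mean "to hole up") is in the house, it might come back. At first I wondered what exactly was supposed to be holed up at home and could not follow him: was he saying that I hole up at home and never go out? It finally dawned on me that he meant the rat's nest: if the nest is in the house, the rat will follow the scent back to my home.

Is that possible? I released the rat on a large vacant lot a kilometer away. If it really does run back, catching it again may not be so easy. But if I do catch it again, I will probably still release it.

This is a bit of an ostrich mentality: unless it is released on an uninhabited island, it will still trouble someone else. But even a rat has a right to live.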

-----

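Besides, killing rats does not solve the problem: with this one gone, there is another. Why did this one come in the first place? That is the real question. As for food, it is all put away in cabinets. As for water, any obvious drinking water has been moved into bottles. If the soap gets gnawed again, I will just buy a sealed soap box. I also keep clearing out clutter so that rats have no suitable place to nest. The kitchen drain pipe had already been replaced with a stainless-steel flexible hose.

If a rat still wants to live at my place and forage outside, then I salute it, and I will just buy a trap cage. And if the cage catches a rat, where should it be released? Chaishan may be more suitable: the ecosystem there is richer, the rat may become food for other species, and it may also find food for itself. Most important, Chaishan is far away; unless it takes the MRT, it probably cannot make it back. As for how a rat rides the MRT, I leave that to your imagination!

Also, I did hear strange noises downstairs on Monday night. I just do not know whether the rat has come back or it is a different one!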

-----

Monday, November 08, 2021

Zi Wei Yuan (紫微願)

2021/11/08

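I had previously read Yi Shu's "Lan Niao Ji" (藍鳥記) and "San Fa" (散髮).

Zi Wei Yuan is a full-length novel. It cannot quite be called a romance, nor science fiction, even though Wisely (衛斯理) is mentioned.

More like philosophy?

People wish for a young body. But a young body and a mature mind do not necessarily go together.

Still, it does not discuss the question very seriously; after all, it is popular fiction meant for entertainment.

I cannot say it is well written, but it is fluent, so the five hundred pages on Haodu went by in a single evening.

There is still plenty to think about, but let's leave it at that.

A pity, really, for a rather good theme.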

-----

Fire

2021/11/09

Winter is here!

To be able to spend the arrival of winter in my warm hometown is truly wonderful.

-----

Big Rat

2021/11/08

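Yesterday I got lucky and sent the household rat on its way!

-----

Lately the kitchen drain pipe kept getting chewed through. I first thought it was cockroaches, but now and then there were odd signs, such as things knocked over, and most recently even tooth marks on the soap.

Last night while I was watching anime, a small animal appeared in the third-floor bedroom. Sure enough, the rat had shown itself.

First I shut the bedroom door and blocked the gap under it with a heavy cardboard box and some stuffing. Then I started reading up on rat traps. There are several kinds.

When I shifted the bed, the rat tried to run out, but the gap was sealed, so it ran to the side of the box. I pinned it with the box, grabbed it with an old sports jacket, and wrapped the jacket in a plastic bag. The rat was finally caught.

What to do with it? Release it. Where? Zizhulin Monastery (紫竹林精舍), so it could hear the Dharma? A rat running into the monastery would trouble everyone. In the end I took it to the grass beside the railway workshop, the closest thing to "the wild" nearby.

The rat was sent off smoothly. Thank the Bodhisattva!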

-----

Kaohsiung Mini Trips (17): Changzhi

2021/11/08

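I made a trip to Changzhi on Saturday afternoon.

-----

Brother Guoren (果仁師兄) of Dharma Drum Mountain went back to his family home in Changzhi and invited me to visit.

This week was quite busy. Fortunately, some of what had to be done I had started preparing long ago, and the week finally passed smoothly. I ran as usual on Monday, and found time on Thursday morning to ride out to Dashu.

I had originally declined the invitation, but the errand I had planned for Saturday morning could not be done, and after lunch I thought it over and decided to go. After confirming with Brother Guoren, I rode to Changzhi.

There were a few other reasons for accepting. He visited me last time, so this was a return visit. Also, I have been to Pingtung several times, and Changzhi was already on my list, just not this soon.

Fengshan to Changzhi is not hard: Provincial Highway 1 to Highway 3 to Highway 24, and the signs are clear. But at Changzhi I turned into the town center, wandered around for a while, and only got back on track with navigation and by asking directions.

Fanhua (繁華) is in fact right beside Highway 24. Brother Guoren's family "farmhouse" has been rebuilt into a very handsome townhouse. We talked a lot about life experience, but I will skip the details. After enjoying a warm dinner from "country folk" (I come from the countryside myself), we chatted outdoors for another half hour before I left Changzhi.

The ride back was easier, but I still asked for directions once or twice to be safe, because it was dark.

On the way back I passed the culvert under the freeway and, while I was at it, dealt with half of the long-abandoned television set. Many things come down to circumstances; many things come down to resolve.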

-----

Sunday, November 07, 2021

ConvS2S (4): Appendix

2021/10/27

-----

-----

Outline

1. ConvS2S
2. I/O Embedding
3. Padding and Mask
4. GLU
5. Encoder
6. Decoder
7. Multi-Hop Attention
8. Details

-----

ConvS2S

Figure 1. Illustration of batching during training. The English source sentence is encoded (top) and we compute all attention values for the four German target words (center) simultaneously. Our attentions are just dot products between decoder context representations (bottom left) and encoder representations. We add the conditional inputs computed by the attention (center right) to the decoder states which then predict the target words (bottom right). The sigmoid and multiplicative boxes illustrate Gated Linear Units.

# ConvS2S.

-----

I/O Embedding

The one-hot encoding is compressed to 768 dimensions.

The 768 dimensions are then mapped to 512, 1024, or 2048.
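A minimal sketch of these two steps with toy sizes (a vocabulary of 4, an embedding size of 3 standing in for 768, and a hidden size of 2 standing in for 512). Multiplying a one-hot vector by the embedding matrix is the same as picking one row of it; a second matrix then changes the dimension. The matrices `E` and `W`, their values, and the helper names are assumptions for illustration.

```python
def one_hot(index, size):
    """Return a one-hot vector of the given size."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def vec_times_matrix(vec, matrix):
    """Row vector (n) times matrix (n x m) -> vector (m)."""
    return [
        sum(vec[i] * matrix[i][j] for i in range(len(vec)))
        for j in range(len(matrix[0]))
    ]


# Toy embedding table: vocabulary 4 x embedding size 3.
E = [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9],
    [1.0, 1.1, 1.2],
]
# Toy projection: embedding size 3 x hidden size 2.
W = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
]

token_id = 2
embedded = vec_times_matrix(one_hot(token_id, len(E)), E)  # same as E[token_id]
hidden = vec_times_matrix(embedded, W)
print(embedded, hidden)
```

In practice the one-hot product is never materialized; frameworks index the row directly, which gives the same result.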

https://zhuanlan.zhihu.com/p/27234078

https://www.telesens.co/2019/04/21/understanding-incremental-decoding-in-fairseq/

-----

Modified from # ConvS2S

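Padding: because a one-dimensional convolution of width 3 (Conv3) is used, the encoder-side sentence "They agree" gets the end-of-sentence marker </s> appended and is then padded with two <p> tokens. The decoder-side German "Sie stimmen zu" gets the start-of-sentence marker <s> and is likewise padded with two <p> tokens. The attention matrix shows the weights between <s> Sie stimmen zu and They agree </s>.

Mask: the blue regions on the decoder side and the encoder side have different shapes. When predicting the next word, the decoder does not use information from the future.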
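A small sketch of why two padding tokens go with a width-3 convolution on the decoder side: padding k - 1 positions on the left makes each output depend only on the current and earlier inputs, so no future information leaks in. The function name `causal_conv1d`, the scalar inputs, and the all-ones kernel are assumptions chosen to keep the arithmetic easy to follow.

```python
def causal_conv1d(xs, kernel, pad=0.0):
    """1-D convolution where output[t] sees only xs[t-k+1 .. t].

    xs: list of floats (one scalar per position, for simplicity).
    kernel: list of k weights; k - 1 pad values are added on the left.
    """
    k = len(kernel)
    padded = [pad] * (k - 1) + list(xs)
    return [
        sum(kernel[j] * padded[t + j] for j in range(k))
        for t in range(len(xs))
    ]


kernel = [1.0, 1.0, 1.0]  # width 3, so two pad positions
out = causal_conv1d([1.0, 2.0, 3.0, 4.0], kernel)
print(out)

# Changing the last input must not change earlier outputs.
out_changed = causal_conv1d([1.0, 2.0, 3.0, 99.0], kernel)
print(out_changed[:3] == out[:3])
```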

Modified from # ConvS2S

-----

GLU

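# GLU.

Notes:

The sentence vectors first go through two separate one-dimensional convolutions. The upper half goes through a sigmoid so that it acts like an LSTM gate, and is then multiplied element-wise with the other half.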
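A minimal sketch of the gating step, assuming the two convolution outputs are already computed and passed in as plain lists. The names `glu` and `glu_block` are hypothetical; `glu_block` also shows the residual connection that the ConvS2S notes mention.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def glu(values, gates):
    """Gated linear unit: values * sigmoid(gates), element-wise."""
    return [v * sigmoid(g) for v, g in zip(values, gates)]


def glu_block(x, values, gates):
    """GLU output plus a residual connection back to the block input x."""
    return [xi + yi for xi, yi in zip(x, glu(values, gates))]


# A gate pre-activation of 0 lets half of the value through.
print(glu([2.0, 4.0], [0.0, 0.0]))
print(glu_block([1.0, 1.0], [2.0, 4.0], [0.0, 0.0]))
```

A large positive gate passes the value almost unchanged and a large negative gate blocks it, which is the LSTM-like behavior the note refers to.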

-----

# ConvS2S

-----

# ConvS2S

說明：

The GLU block in ConvS2S adds a residual connection.

https://zhuanlan.zhihu.com/p/27464080

-----

# ConvS2S

-----

Encoder

Modified from # ConvS2S

說明：

The output of the encoder's last layer serves as the attention reference for every decoder layer (eight layers in total).

https://reniew.github.io/44/

-----

Decoder

Modified from # ConvS2S

說明：

The attention weights decide how much each word on the encoder side contributes.

https://reniew.github.io/44/
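A minimal sketch of dot-product attention for one decoder position, in line with the Figure 1 caption's remark that the attentions are dot products between decoder and encoder representations. The function name `attend`, the toy vectors, and the separate `values` argument are assumptions for illustration.

```python
import math


def attend(query, keys, values):
    """Dot-product attention for a single query.

    query: decoder state (list of floats).
    keys, values: one vector per encoder position.
    Returns (weights, context).
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    context = [
        sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)
    ]
    return weights, context


# Two encoder positions with identical keys get equal weight.
weights, context = attend(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [1.0, 0.0]],
    values=[[2.0, 0.0], [0.0, 2.0]],
)
print(weights, context)
```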

The encoder reads the whole input at once; the decoder produces its output word by word, one word at a time.

https://ycts.github.io/weeklypapers/convSeq2seq/

-----

Modified from # ConvS2S

-----

QKV

Modified from # ConvS2S

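C = softmax(QKᵀ)V, written loosely as C = QKV.

last layer: the last layer of the encoder output.

Each time the four words have been emitted, the hidden state is fed in again as the input to the next layer.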

https://deeplearning.hatenablog.com/entry/convs2s

-----

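# Short Attention.

Notes:

Yt = [h(t-L) ... h(t-1)], the previous L hidden states.

1 is a vector whose entries are all 1, with dimension L.

T denotes the transpose.

(1) Compute the hidden-layer output.

(2) Compute the weights.

(3) Combine Yt with the weights to get the context vector.

(4) Combine the context vector with ht to get ht*.

(5) Convert ht* into yt, the probability distribution over which of the |V| words should be output.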
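The five steps can be written out as equations. The block below is a reconstruction from the symbols the notes name (Yt, 1, T, ht, ht*, yt, |V|), assuming the usual additive-attention form; the weight names W^Y, W^h, w, W^r, W^x, W^* and b are assumptions and should be checked against the figure and the paper.

```latex
\begin{align}
M_t &= \tanh\!\left(W^{Y} Y_t + (W^{h} h_t)\,\mathbf{1}^{T}\right) \tag{1}\\
\alpha_t &= \operatorname{softmax}\!\left(w^{T} M_t\right) \tag{2}\\
r_t &= Y_t\,\alpha_t^{T} \tag{3}\\
h_t^{*} &= \tanh\!\left(W^{r} r_t + W^{x} h_t\right) \tag{4}\\
y_t &= \operatorname{softmax}\!\left(W^{*} h_t^{*} + b\right) \tag{5}
\end{align}
```

Here the product with the all-ones vector in (1) simply repeats the projected ht across the L columns so it can be added to every column of the projected Yt.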

-----

# Short Attention.

Notes:

(6) ht has dimension 2k; kt and vt each have dimension k.

(7) The k components replace Yt.

(8) This formula is unchanged.

(9) The v components replace Yt.

(10) vt replaces ht.

-----

# Short Attention.

Notes:

(11) ht has dimension 3k; kt, vt, and pt each have dimension k.

(12) ht*, rt, and pt all have dimension k.
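A small sketch of the split described in (11): one hidden vector of size 3k is cut into a key part, a value part, and a predict part of size k each, so that each part can serve a single purpose. The function name `split_key_value_predict` and the key, value, predict ordering are assumptions for illustration.

```python
def split_key_value_predict(h):
    """Split a hidden vector of size 3k into (key, value, predict)."""
    if len(h) % 3 != 0:
        raise ValueError("hidden size must be a multiple of 3")
    k = len(h) // 3
    return h[:k], h[k:2 * k], h[2 * k:]


h_t = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # toy vector with k = 2
key, value, predict = split_key_value_predict(h_t)
print(key, value, predict)
```

The key part is used for scoring, the value part for building the context vector, and the predict part for the output distribution.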

-----

Modified from # ConvS2S

-----

Multi-hop attention

# ConvS2S

說明：

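Eight attention layers.

Layers 1, 3, 6: linear alignment.

Layers 2, 8: information about the whole source sentence.

Layer 4: nouns.

Layers 5, 7: "built", reordered to fit the German translation (its grammatical structure).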

# ConvS2S

byte-pair encoding (BPE)

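"BPE was first proposed by Philip Gage for data compression. The idea is to represent a frequent pair of consecutive symbols with another symbol. For example, in ababab, a is very often followed by b, so we use c to represent ab and obtain the new sequence ccc, where c = ab."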

https://theblackcat102.github.io/BPE/
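A minimal sketch of a single BPE merge on the ababab example: count adjacent symbol pairs, pick the most frequent one, and replace every occurrence with a new symbol. The function names are hypothetical, and a real tokenizer would repeat this step many times over a whole corpus.

```python
from collections import Counter


def most_frequent_pair(symbols):
    """Return the most frequent adjacent pair, or None if there is none."""
    pairs = Counter(zip(symbols, symbols[1:]))
    if not pairs:
        return None
    return pairs.most_common(1)[0][0]


def merge_pair(symbols, pair, new_symbol):
    """Replace each non-overlapping occurrence of pair with new_symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(new_symbol)
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out


symbols = list("ababab")
pair = most_frequent_pair(symbols)  # ('a', 'b') occurs 3 times, ('b', 'a') 2
merged = merge_pair(symbols, pair, "c")
print(pair, "".join(merged))
```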

「Layer 1, 3 and 6 exhibit a linear alignment. The first layer shows the clearest alignment, although it is slightly off and frequently attends to the corresponding source word of the previously generated target word. 」

「Layer 2 and 8 lack a clear structure and are presumably collecting information about the whole source sentence. 」

「The fourth layer shows high alignment scores on nouns such as “festival”, “way” and “work” for both the generated target nouns as well as their preceding words. Note that in German, those preceding words depend on gender and object relationship of the respective noun. 」

「Finally, the attention scores in layer 5 and 7 focus on “built”, which is reordered in the German translation and is moved from the beginning to the very end of the sentence. One interpretation for this is that as generation progresses, the model repeatedly tries to perform the re-ordering. “aufgebaut” can be generated after a noun or pronoun only, which is reflected in the higher scores at positions 2, 5, 8, 11 and 13.」

-----

Modified from # ConvS2S.

-----

Modified from # ConvS2S.

-----

Modified from # ConvS2S.

-----

Details

https://zhuanlan.zhihu.com/p/60524073

https://norman3.github.io/papers/docs/fairseq.html

-----

Table 4. Effect of removing position embeddings from our model in terms of validation perplexity (valid PPL) and BLEU.

# ConvS2S

-----

PPL

Perplexity

The lower the PPL, the better the model.

https://towardsdatascience.com/perplexity-intuition-and-derivation-105dd481c8f3

https://blog.csdn.net/blmoistawinde/article/details/104966127

https://medium.com/nlp-tsupei/perplexity%E6%98%AF%E4%BB%80%E9%BA%BC-426f52897513

https://www.youtube.com/watch?v=8s56yyL-EfQ
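A minimal sketch of how perplexity is computed from the probabilities a model assigns to the correct tokens: it is the exponential of the average negative log-likelihood. The function name `perplexity` and the toy probabilities are assumptions for illustration.

```python
import math


def perplexity(token_probs):
    """Perplexity = exp(mean negative log-probability of the true tokens)."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)


uniform_over_four = perplexity([0.25, 0.25, 0.25, 0.25])
confident = perplexity([0.9, 0.9, 0.9, 0.9])
print(uniform_over_four, confident)
```

A model that spreads its probability evenly over four choices has a perplexity of about 4, while a model that puts more probability on the correct tokens scores lower, which is why lower PPL indicates a better model.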

-----

References

# ConvS2S. Cited 1772 times.

Gehring, Jonas, et al. "Convolutional sequence to sequence learning." arXiv preprint arXiv:1705.03122 (2017).

https://arxiv.org/pdf/1705.03122.pdf

# GLU

Dauphin, Yann N., et al. "Language modeling with gated convolutional networks." Proceedings of the 34th International Conference on Machine Learning-Volume 70. JMLR. org, 2017.

https://arxiv.org/pdf/1612.08083.pdf

-----

The Star Also Rises: NLP（四）：ConvS2S

https://hemingwang.blogspot.com/2019/04/convs2s.html

-----
