Friday, August 13, 2021

Word2vec

2018/11/26

Description:

Word2vec has three main versions.

v1 covers CBOW and Skip-gram.
https://hemingwang.blogspot.com/2020/07/word2vec-v1.html

v2 covers Hierarchical Softmax and Negative Sampling.
https://hemingwang.blogspot.com/2020/07/word2vec-v2.html

v3 is a simplified version of v1 and v2.
https://hemingwang.blogspot.com/2020/08/word2vec-v3.html
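
As a rough illustration of the terms named above, the sketch below builds Skip-gram and CBOW training examples from a toy sentence and evaluates the negative-sampling loss for one (center, context) pair. It is a minimal sketch, not the code from the linked posts: the function names (`skipgram_pairs`, `cbow_examples`, `negative_sampling_loss`), the window size, and the toy sentence are all assumptions made for illustration, and Hierarchical Softmax is not covered.

```python
import math


def skipgram_pairs(tokens, window=2):
    """Skip-gram: each center word predicts every word inside its window."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


def cbow_examples(tokens, window=2):
    """CBOW: the surrounding context words jointly predict the center word."""
    examples = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        context = [tokens[j] for j in range(lo, hi) if j != i]
        examples.append((context, center))
    return examples


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def negative_sampling_loss(center_vec, context_vec, negative_vecs):
    """Loss for one pair: -log s(u_o . v_c) - sum_k log s(-u_k . v_c).

    center_vec is the input vector v_c, context_vec the output vector u_o of
    the observed context word, and negative_vecs the output vectors u_k of
    the sampled negative words (hypothetical toy values here).
    """
    loss = -math.log(sigmoid(dot(context_vec, center_vec)))
    for neg in negative_vecs:
        loss -= math.log(sigmoid(-dot(neg, center_vec)))
    return loss


# Toy sentence, assumed for illustration only.
tokens = "the quick brown fox jumps".split()
pairs = skipgram_pairs(tokens, window=1)
examples = cbow_examples(tokens, window=1)

# With all-zero vectors every sigmoid is 0.5, so the loss is (k + 1) * log 2.
zero = [0.0, 0.0]
baseline = negative_sampling_loss(zero, zero, [zero, zero])
```

Negative sampling replaces the full softmax over the vocabulary with one positive term and a handful of negative terms, so the cost per training pair depends on the number of negatives rather than on the vocabulary size.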

-----

References

◎ Papers

# Linguistic Regularity

Mikolov, Tomas, Wen-tau Yih, and Geoffrey Zweig. "Linguistic regularities in continuous space word representations." Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2013.
http://www.aclweb.org/anthology/N13-1090

# Word2vec Explained
Goldberg, Yoav, and Omer Levy. "word2vec Explained: deriving Mikolov et al.'s negative-sampling word-embedding method." arXiv preprint arXiv:1402.3722 (2014).
https://arxiv.org/pdf/1402.3722.pdf

#
Word Embedding for Understanding Natural Language: A Survey | SpringerLink
https://link.springer.com/chapter/10.1007/978-3-319-53817-4_4

#
Lai, Siwei (来斯惟). "Word and Document Embeddings based on Neural Network Approaches" (基于神经网络的词和文档语义向量表示方法研究). (2016).
https://arxiv.org/ftp/arxiv/papers/1611/1611.05962.pdf

-----

◎ English

The Illustrated Word2vec – Jay Alammar – Visualizing machine learning one concept at a time
http://jalammar.github.io/illustrated-word2vec/

Learning Word Embedding
https://lilianweng.github.io/lil-log/2017/10/15/learning-word-embedding.html

The amazing power of word vectors | the morning paper
https://blog.acolyer.org/2016/04/21/the-amazing-power-of-word-vectors/
 
Intuitive Understanding of Word Embeddings: Count Vectors to Word2Vec
https://www.analyticsvidhya.com/blog/2017/06/word-embeddings-count-word2veec/

Introduction to Word Embedding and Word2Vec – Towards Data Science
https://towardsdatascience.com/introduction-to-word-embedding-and-word2vec-652d0c2060fa
 
Deep Learning Weekly | Demystifying Word2Vec
https://www.deeplearningweekly.com/blog/demystifying-word2vec

Deep Learning, NLP, and Representations - colah's blog
http://colah.github.io/posts/2014-07-NLP-RNNs-Representations/

-----

Word2Vec Tutorial - The Skip-Gram Model · Chris McCormick
http://mccormickml.com/2016/04/19/word2vec-tutorial-the-skip-gram-model/

Learn TensorFlow, the Word2Vec model, and the TSNE algorithm using rock bands
https://medium.freecodecamp.org/learn-tensorflow-the-word2vec-model-and-the-tsne-algorithm-using-rock-bands-97c99b5dcb3a

Vector Representations of Words  |  TensorFlow
https://www.tensorflow.org/tutorials/word2vec

Learn Word2Vec by implementing it in tensorflow – Towards Data Science
https://towardsdatascience.com/learn-word2vec-by-implementing-it-in-tensorflow-45641adaf2ac

Word2Vec word embedding tutorial in Python and TensorFlow - Adventures in Machine Learning
http://adventuresinmachinelearning.com/word2vec-tutorial-tensorflow/

Implementing word2vec in PyTorch (skip-gram model) – Towards Data Science
https://towardsdatascience.com/implementing-word2vec-in-pytorch-skip-gram-model-e6bae040d2fb

Word2Vec in Deeplearning4j | Deeplearning4j
https://deeplearning4j.org/docs/latest/deeplearning4j-nlp-word2vec

gensim: models.word2vec – Deep learning with word2vec
https://radimrehurek.com/gensim/models/word2vec.html

Approximating the Softmax for Learning Word Embeddings
http://ruder.io/word-embeddings-softmax/

-----

Applying word2vec to Recommenders and Advertising · Chris McCormick
http://mccormickml.com/2018/06/15/applying-word2vec-to-recommenders-and-advertising/

Stop Using word2vec | Stitch Fix Technology – Multithreaded
https://multithreaded.stitchfix.com/blog/2017/10/18/stop-using-word2vec/

-----

◎ Simplified Chinese

Detailed Explanation of the Mathematical Principles in word2vec (1): Contents and Preface - peghoty - CSDN Blog
https://blog.csdn.net/itplus/article/details/37969519

[NLP] Understanding the Essence of Word2vec Word Vectors in Seconds - Zhihu
https://zhuanlan.zhihu.com/p/26306795

A Big Collection of NLP Pre-trained Models! | 机器之心 (Synced)
https://www.jiqizhixin.com/articles/2018-12-28-5

Training word2vec on a Chinese Wiki Corpus - zhyongwei's blog - CSDN Blog
https://blog.csdn.net/zhyongwei/article/details/79597894

-----

◎ Traditional Chinese

Introduction to Natural Language Processing: A Small Word2vec Implementation – PyLadies Taiwan – Medium
https://medium.com/pyladies-taiwan/%E8%87%AA%E7%84%B6%E8%AA%9E%E8%A8%80%E8%99%95%E7%90%86%E5%85%A5%E9%96%80-word2vec%E5%B0%8F%E5%AF%A6%E4%BD%9C-f8832d9677c8

科技大擂台 (Formosa Grand Challenge): Introduction to Word Vectors
https://fgc.stpi.narl.org.tw/activity/videoDetail/4b1141305ddf5522015de5479f4701b1

Singular Value Decomposition (SVD) | 線代啟示錄
https://ccjou.wordpress.com/2009/09/01/奇異值分解-svd/

-----
