Wednesday, September 06, 2017

Python Spark ML (4): Decision Tree Survey - Supplementary Material


2017/09/12

There is no homework this time.

For any discussion, please reply at:
https://www.facebook.com/groups/pythontw/permalink/10156873112708438/

-----

Summary:

Machine learning [1] covers many algorithms [2]. Decision trees are one kind of supervised learning [3], [4].

Classification trees can use entropy and Gini as their splitting criteria [5]-[8], while regression trees use variance [18].
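To make the three criteria concrete, here is a minimal sketch in plain Python. It assumes that "Gini" means the Gini impurity used for tree splitting, that entropy is measured in bits, and that variance is the population variance. The function names `entropy`, `gini`, and `variance` are illustrative only and are not taken from scikit-learn or Spark.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    n = len(labels)
    return sum(-(c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def variance(values):
    """Population variance of a non-empty list of numeric targets."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n
```

A node holding only one class scores 0 under both `entropy` and `gini`, and an evenly mixed two-class node scores the maximum (1 bit and 0.5 respectively). A tree learner prefers the split whose child nodes have the lowest weighted impurity.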

For a deeper discussion of decision trees, see [9]-[14]; for implementations, see [15], [16].

Both Python and Spark have packages that support decision trees [17], [18]. The rest of this series (Python Spark ML) will mainly follow [18].

-----

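P.S. For the previous homework, quite a few classmates contributed very good supplementary material. I have organized it here as a supplement to [19].

I list their names below to thank the classmates who answered so carefully :)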

Charles Wang
章銘恒
王仁佑
王得懿
吳政龍
Yu-wei Chen
陳敬翔
Mirage Chung

-----


Fig. 1. Decision tree regression [17].
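As a rough illustration of what decision tree regression does at a single node, the sketch below searches for the threshold on one numeric feature that minimizes the weighted variance of the two child nodes. It assumes one feature and the variance criterion mentioned in the Summary. `variance` and `best_split` are hypothetical helper names and do not reflect how scikit-learn or Spark implement their trees.

```python
def variance(values):
    """Population variance of a non-empty list of numeric targets."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def best_split(xs, ys):
    """Return (threshold, weighted_variance) for the best split x <= threshold.

    The threshold is None when no split improves on the unsplit node.
    """
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    best = (None, variance(ys))
    for i in range(1, n):
        # Equal feature values cannot be separated by a threshold.
        if pairs[i - 1][0] == pairs[i][0]:
            continue
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * variance(left) + len(right) * variance(right)) / n
        if score < best[1]:
            best = (threshold, score)
    return best


threshold, score = best_split([1, 2, 3, 4], [1.0, 1.0, 5.0, 5.0])
```

A full regression tree would apply this search recursively to each child node and predict the mean target value of the leaf that a sample falls into.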

-----

References

Overview:

[1] AI - Ch13 機器學習(1), 機器學習簡介與監督式學習 Introduction to Machine Learning, Supervised Learning _ Mr. Opengate
http://mropengate.blogspot.tw/2015/05/ai-supervised-learning.html

[2] 演算法筆記 - Classification
http://www.csie.ntnu.edu.tw/~u91029/Classification.html

[3] 最直覺的分類--決策樹 _ 幣圖誌Bituzi - 挑戰市場規則
http://www.bituzi.com/2014/12/the-most-intuitive-classification-dicision-tree.html

[4] Behavior tree (artificial intelligence, robotics and control) - Wikipedia
https://en.wikipedia.org/wiki/Behavior_tree_%28artificial_intelligence,_robotics_and_control%29

-----

Decision tree fundamentals:
 
[5] 决策树的数学原理 - liuzhiqiangruc - ITeye博客
http://liuzhiqiangruc.iteye.com/blog/2289986

[6] 怎样理解 Cross Entropy _ MemoMemoMemo
http://shuokay.com/2017/06/23/cross-entropy/

[7] 綠角財經筆記: 什麼是吉尼係數 (What is Gini Coefficient)
http://greenhornfinancefootnote.blogspot.tw/2011/09/what-is-gini-coefficient.html

[8] 綠角財經筆記: 世界各國的吉尼係數 (Gini Coefficients Data)
http://greenhornfinancefootnote.blogspot.tw/2011/09/gini-coefficients-data.html

-----

University course materials:

[9] Construction of Decision Trees
https://dspace.mit.edu/bitstream/handle/1721.1/5845/AIM-189.pdf?sequence=2 

[10] Simplifying Decision Tree
https://dspace.mit.edu/bitstream/handle/1721.1/6453/AIM-930.pdf?sequence=2 

[11] Decision Tree Learning
http://www.cs.princeton.edu/courses/archive/spr07/cos424/papers/mitchell-dectrees.pdf

[12] The alternating decision tree learning algorithm
https://cseweb.ucsd.edu/~yfreund/papers/atrees.pdf 

-----

Decision trees in depth:

[13] Why do Decision Trees Work? – Win-Vector Blog
http://www.win-vector.com/blog/2017/01/why-do-decision-trees-work/ 

-----

Applications:

[14] Using Decision Trees to predict infant birth weights _ DataScience+
https://datascienceplus.com/using-decision-trees-to-predict-infant-birth-weights/?fref=gc&dti=197223143437

-----

Implementation:

[15] learn_python_for_a_r_user_day23.md at master · yaojenkuo_learn_python_for_a_r_user · GitHub
https://github.com/yaojenkuo/learn_python_for_a_r_user/blob/master/day23.md

[16] How To Implement The Decision Tree Algorithm From Scratch In Python
https://machinelearningmastery.com/implement-decision-tree-algorithm-scratch-python/

[17] scikit-learn - Decision Trees
http://scikit-learn.org/stable/modules/tree.html

[18] Spark 2 - Decision Tree
https://spark.apache.org/docs/latest/mllib-decision-tree.html

[19] Python Spark ML (3): Decision Tree Survey
https://hemingwang.blogspot.tw/2017/09/python-spark-mldecision-tree-survey.html
