Thursday, February 25, 2021

Reinforcement Learning 1: An Introduction

2020/04/17

-----


Demystifying Deep Reinforcement Learning | Computational Neuroscience Lab
https://neuro.cs.ut.ee/demystifying-deep-reinforcement-learning/

-----


Deep Reinforcement Learning

TD (Q-learning): DQN, DDQN, DNA, NAF, C51, QR-DQN, HER, DQfD, Rainbow.
AC (Actor-Critic): A3C (A2C), (DRQN) UNREAL, DPG, DDPG, TD3, SAC, ACKTR.
PG (REINFORCE): TRPO, PPO, PDO, CPO, IPO.

-----


// AlphaGo [1].

-----


// DQN [1].

-----


// NAS-RL [2].

-----


// SARS [1].

-----


Q-Learning : A Maneuver of Mazes - Becoming Human: Artificial Intelligence Magazine
https://becominghuman.ai/q-learning-a-maneuver-of-mazes-885137e957e4

-----


My Journey to Reinforcement Learning — Part 1: Q-Learning with Table
https://towardsdatascience.com/my-journey-to-reinforcement-learning-part-1-q-learning-with-table-35540020bcf9

-----


Introduction to Reinforcement Learning — Deep Reinforcement Learning for Hackers (Part 0)
https://medium.com/@curiousily/getting-your-feet-rewarded-deep-reinforcement-learning-for-hackers-part-0-900ca5bb83e5

-----


Q-learning - Wikipedia
https://en.m.wikipedia.org/wiki/Q-learning

-----


Introduction to Deep Q-Learning for Reinforcement Learning (in Python)
https://www.analyticsvidhya.com/blog/2019/04/introduction-deep-q-learning-python/

-----


// Sarsa [3].

-----


// Q-learning [3].

-----


// Sarsa and Q-learning [4].
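
The difference between the two methods captioned here comes down to the bootstrap target. Below is a minimal sketch of the standard tabular update rules, not taken from the cited sources. The function names, the nested-list Q-table, and the values of alpha and gamma are all assumptions chosen for illustration.

```python
def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy TD update: bootstrap from the action actually taken next."""
    target = r + gamma * q[s_next][a_next]
    q[s][a] += alpha * (target - q[s][a])
    return q[s][a]


def q_learning_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    """Off-policy TD update: bootstrap from the greedy (max-value) next action."""
    target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])
    return q[s][a]


# Toy Q-tables: 2 states x 2 actions (hypothetical values, illustration only).
q_sarsa = [[0.0, 0.0], [1.0, 2.0]]
q_qlearn = [[0.0, 0.0], [1.0, 2.0]]

# Same transition (s=0, a=0, r=1.0, s_next=1) fed to both rules.
# Sarsa is told the next action is 0; Q-learning picks the best next action itself.
sarsa_value = sarsa_update(q_sarsa, s=0, a=0, r=1.0, s_next=1, a_next=0)
qlearn_value = q_learning_update(q_qlearn, s=0, a=0, r=1.0, s_next=1)

print(sarsa_value, qlearn_value)
```

With these toy numbers, Sarsa bootstraps from Q(1, 0) = 1.0 while Q-learning bootstraps from max Q(1, ·) = 2.0, so the Q-learning estimate for (s=0, a=0) ends up larger after one step.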

-----


Deep-Learning-Papers-Reading-Roadmap/README.md at master · floodsung/Deep-Learning-Papers-Reading-Roadmap · GitHub
https://github.com/floodsung/Deep-Learning-Papers-Reading-Roadmap/blob/master/README.md

-----


// Evolution of reinforcement learning [5].

-----


// Some reinforcement learning algorithms [6], [7].

-----


[1708.07902] Deep Learning for Video Game Playing
https://arxiv.org/abs/1708.07902

-----


[1910.09615] IPO: Interior-point Policy Optimization under Constraints
https://arxiv.org/abs/1910.09615

-----


Implementing Reinforcement Learning with Python: Using TensorFlow and OpenAI Gym (用Python實作強化學習|使用TensorFlow與OpenAI Gym)
http://books.gotop.com.tw/v_ACD017800

-----


GitHub - Curt-Park/rainbow-is-all-you-need: Rainbow is all you need! A step-by-step tutorial from DQN to Rainbow
https://github.com/Curt-Park/rainbow-is-all-you-need

-----


GitHub - MrSyee/pg-is-all-you-need: Policy Gradient is all you need! A step-by-step tutorial for well-known PG methods.
https://github.com/MrSyee/pg-is-all-you-need

-----

References

[1] Reinforcement Learning (強化學習)
https://www.slideshare.net/mobile/yenlung/reinforcement-learning-90737484

[2] [Paper reading] Neural Architecture Search with Reinforcement Learning – AMMAI
https://sss050531.wordpress.com/2018/06/09/論文閱讀neural-architecture-search-with-reinforcement-learning/

[3] Reinforcement Learning (7): Q-Learning and Sarsa - Zhihu
https://zhuanlan.zhihu.com/p/46850008

[4] artificial intelligence - What is the difference between Q-learning and SARSA? - Stack Overflow
https://stackoverflow.com/questions/6848828/what-is-the-difference-between-q-learning-and-sarsa

[5] Evolution of Reinforcement Learning - Zhihu
https://zhuanlan.zhihu.com/p/49429128

[6] Reinforcement Learning algorithms — an intuitive overview
https://medium.com/@SmartLabAI/reinforcement-learning-algorithms-an-intuitive-overview-904e2dff5bbc

[7] Part 2: Kinds of RL Algorithms — Spinning Up documentation
https://spinningup.openai.com/en/latest/spinningup/rl_intro2.html
