The Star Also Rises: Imitation Learning

Imitation Learning

2021/08/17

-----

https://pixabay.com/zh/photos/woman-shopping-lifestyle-beautiful-3040029/

-----

Forecasting models for coronavirus disease (COVID-19): a survey of the state-of-the-art

2021/08/31

GR Shinde, AB Kalamkar, PN Mahalle, N Dey… - SN Computer …, 2020 - Springer

「COVID-19 is a pandemic that has affected over 170 countries around the world. The number of infected and deceased patients has been increasing at an alarming rate in almost all the affected nations. Forecasting techniques can be inculcated thereby assisting in designing …」

Related reading that follows on from the paper 「Composite Monte Carlo decision making under high uncertainty of novel coronavirus epidemic using hybridized deep learning and fuzzy rule induction」.

-----

COVID-19 epidemic analysis using machine learning and deep learning algorithms

2021/08/30

NS Punn, SK Sonbhadra, S Agarwal - MedRxiv, 2020 - medrxiv.org

「The catastrophic outbreak of Severe Acute Respiratory Syndrome-Coronavirus (SARS-CoV-2) also known as COVID-2019 has brought the worldwide threat to the living society. The whole world is putting incredible efforts to fight against the spread of this deadly disease in …」

-----

Composite Monte Carlo decision making under high uncertainty of novel coronavirus epidemic using hybridized deep learning and fuzzy rule induction

2021/08/29

SJ Fong, G Li, N Dey, RG Crespo… - Applied soft computing, 2020 - Elsevier

「In the advent of the novel coronavirus epidemic since December 2019, governments and authorities have been struggling to make critical decisions under high uncertainty at their best efforts. In computer science, this represents a typical problem of machine learning over …」

-----

Structured Learning

2021/08/28

「The inputs and outputs of SVM (Support Vector Machine), Deep Learning, and Neural Network models are all vectors. In practice, however, inputs and outputs can take forms more complex than a vector: a sequence, a list, a tree, or a bounding box. Structured learning seeks a function whose input and output are both objects.」

https://ai4dt.wordpress.com/2018/05/23/structure-learning/
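
To make "a function whose input and output are both objects" concrete, here is a minimal sketch of one common way to frame it: score every candidate output against the input and return the highest-scoring one. The tag set, the weights, and the names `score` and `predict` are all hypothetical, and the brute-force search only works for toy inputs.

```python
from itertools import product

TAGS = ("N", "V")  # hypothetical tag set: noun, verb

# Hypothetical hand-set weights standing in for learned parameters.
EMIT = {("dogs", "N"): 2.0, ("bark", "V"): 2.0, ("bark", "N"): 0.5}
TRANS = {("N", "V"): 1.0, ("V", "N"): 0.5}


def score(words, tags):
    """Compatibility between an input sequence and a candidate output sequence."""
    total = sum(EMIT.get((w, t), 0.0) for w, t in zip(words, tags))
    total += sum(TRANS.get(pair, 0.0) for pair in zip(tags, tags[1:]))
    return total


def predict(words):
    """Return the tag sequence with the highest score (exhaustive search)."""
    candidates = product(TAGS, repeat=len(words))
    return max(candidates, key=lambda tags: score(words, tags))


best = predict(("dogs", "bark"))
print(best)
```

Both the input (a word sequence) and the output (a tag sequence) are whole objects here, not fixed-length vectors. Real systems replace the exhaustive search with something like dynamic programming.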

-----

Imitation Learning - YouTube

2021/08/27

https://www.youtube.com/watch?v=rOho-2oJFeA

-----

Reinforcement Learning: Imitation Learning

2021/08/26

RL — Imitation Learning

「One of the biggest challenges is collecting expert demonstrations. Unless it has a huge business potential, the attached cost can be prohibitive. But technically, there is another major issue. We can never duplicate things exactly. Error accumulates fast in a trajectory and put us into situations that we never deal with before.」

https://jonathan-hui.medium.com/rl-imitation-learning-ac28116c02fc
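
The point about errors accumulating along a trajectory can be illustrated with a toy rollout. This is only a sketch under strong assumptions: a straight lane, a two-number state (offset, heading), an expert that always outputs action 0, and a cloned policy modelled as the expert's action plus a constant small error. The function name `rollout` and the numbers are made up for illustration.

```python
def rollout(action_bias, horizon):
    """Drive along a straight lane and return the lateral offset at every step.

    The expert's demonstrations only ever contain the state (0, 0), so the
    cloned policy has never seen how to recover once it drifts away.
    """
    offset, heading = 0.0, 0.0
    offsets = []
    for _ in range(horizon):
        action = 0.0 + action_bias  # expert action (0) plus imitation error
        heading += action           # the error changes the heading ...
        offset += heading           # ... and the heading moves the car
        offsets.append(offset)
    return offsets


expert = rollout(action_bias=0.0, horizon=50)
learner = rollout(action_bias=0.01, horizon=50)
print(learner[9], learner[49])
```

In this toy model the offset after T steps is bias × T(T+1)/2, so a tiny per-step mistake grows quadratically with the length of the trajectory, and the learner ends up in states that never appear in the expert data.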

-----

The Top 40 Imitation Learning Open Source Projects

2021/08/25

https://awesomeopensource.com/projects/imitation-learning

-----

KAIST-AILab/deeprl_practice_colab: Preparation for ... - GitHub

2021/08/24

「Preparation for Deep Reinforcement Learning using Google Colab - GitHub ... Generative Adversarial Imitation Learning (GAIL) [Ho et al. NIPS 2016].」

https://github.com/KAIST-AILab/deeprl_practice_colab

-----

ICML2018 Imitation Learning Tutorial - Google Sites

2021/08/23

「In this tutorial, we aim to present to researchers and industry practitioners a broad overview of imitation learning techniques and recent applications.」

April 22, 2018 · Uploaded by: Hoang Le

https://sites.google.com/view/icml2018-imitation-learning/

-----

One-shot imitation from observing humans via domain-adaptive meta-learning

2021/08/22

T Yu, C Finn, A Xie, S Dasari, T Zhang… - arXiv preprint arXiv …, 2018 - arxiv.org

「Humans and animals are capable of learning a new behavior by observing others perform the skill just once. We consider the problem of allowing a robot to do the same--learning from a raw video pixels of a human, even when there is substantial domain shift in the …」

-----

One-shot imitation learning

2021/08/21

Y Duan, M Andrychowicz, BC Stadie, J Ho… - arXiv preprint arXiv …, 2017 - arxiv.org

「Imitation learning has been commonly applied to solve different tasks in isolation. This usually requires either careful feature engineering, or a significant number of samples. This is far from what we desire: ideally, robots should be able to learn from very few …」

-----

Imitation learning: A survey of learning methods

2021/08/20

A Hussein, MM Gaber, E Elyan, C Jayne - ACM Computing Surveys …, 2017 - dl.acm.org

「Imitation learning techniques aim to mimic human behavior in a given task. An agent (a learning machine) is trained to perform a task from demonstrations by learning a mapping between observations and actions. The idea of teaching by imitation has been around for …」

-----

Generative adversarial imitation learning

2021/08/19

J Ho, S Ermon - Advances in neural information processing systems, 2016 - papers.nips.cc

「Consider learning a policy from example expert behavior, without interaction with the expert or access to a reinforcement signal. One approach is to recover the expert's cost function with inverse reinforcement learning, then extract a policy from that cost function with …」
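
For reference, the paper avoids the two-stage route described in the abstract and trains the policy directly against a discriminator. As stated in Ho and Ermon (2016), GAIL looks for a saddle point of the objective below. The symbols are: $\pi$ the learner's policy, $\pi_E$ the expert's policy, $D$ a discriminator over state-action pairs, $H(\pi)$ the causal entropy of the policy, and $\lambda \ge 0$ a weighting coefficient.

```latex
\min_{\pi} \max_{D} \;
  \mathbb{E}_{\pi}\!\left[\log D(s,a)\right]
  + \mathbb{E}_{\pi_E}\!\left[\log\bigl(1 - D(s,a)\bigr)\right]
  - \lambda H(\pi)
```

The discriminator tries to tell the learner's state-action pairs from the expert's, while the policy is updated to make the two indistinguishable.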

-----

A brief overview of Imitation Learning

2021/08/18

「The simplest form of imitation learning is behaviour cloning (BC), which focuses on learning the expert’s policy using supervised learning. An important example of behaviour cloning is ALVINN, a vehicle equipped with sensors, which learned to map the sensor inputs into steering angles and drive autonomously. This project was carried out in 1989 by Dean Pomerleau, and it was also the first application of imitation learning in general.」

「The way behavioural cloning works is quite simple. Given the expert’s demonstrations, we divide these into state-action pairs, we treat these pairs as i.i.d. examples and finally, we apply supervised learning. The loss function can depend on the application. 」

https://smartlabai.medium.com/a-brief-overview-of-imitation-learning-8a8a75c44a9c
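
The three steps of behavioural cloning described above (split demonstrations into state-action pairs, treat them as i.i.d., apply supervised learning) can be sketched in a few lines. Everything concrete here is an assumption for illustration: the demonstrations are made-up numbers, the state is a single lateral offset, the action is a steering value, and the supervised learner is ordinary least squares with a squared-error loss.

```python
# Hypothetical expert demonstrations: each trajectory is a list of
# (state, action) steps, with state = lateral offset, action = steering.
demos = [
    [(-2.0, 1.0), (-1.0, 0.5), (0.0, 0.0)],
    [(2.0, -1.0), (1.0, -0.5), (0.0, 0.0)],
]

# 1. Split the demonstrations into state-action pairs, ignoring their order
#    within a trajectory (this is the i.i.d. assumption).
pairs = [step for trajectory in demos for step in trajectory]

# 2. Supervised learning: fit action = w * state + b by least squares.
n = len(pairs)
mean_s = sum(s for s, _ in pairs) / n
mean_a = sum(a for _, a in pairs) / n
cov_sa = sum((s - mean_s) * (a - mean_a) for s, a in pairs)
var_s = sum((s - mean_s) ** 2 for s, _ in pairs)
w = cov_sa / var_s
b = mean_a - w * mean_s


def policy(state):
    """Cloned policy: predict the expert's action for a given state."""
    return w * state + b


print(w, b, policy(4.0))
```

Note that `policy(4.0)` queries a state outside the demonstrated range, which is exactly where a cloned policy has no guarantee of behaving like the expert.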

-----

An Introduction to Imitation Learning

2021/08/17

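「In traditional reinforcement learning tasks, the optimal policy is usually learned by computing the cumulative reward. In multi-step (sequential) decision making, however, the learner receives rewards only infrequently, and learning from cumulative reward in this way faces an enormous search space. Imitation learning handles such sequential decision problems well, and it has many applications in robotics, NLP, and other fields.」

「Imitation learning means learning from examples provided by a demonstrator, typically the decision data of a human expert. The states are then treated as features and the actions as labels, and a classifier (for discrete actions) or a regressor (for continuous actions) is trained to obtain the optimal policy model. The training objective is for the distribution of state-action trajectories generated by the model to match the distribution of the input trajectories, which is similar in spirit to autoencoders and GANs.」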

https://zhuanlan.zhihu.com/p/25688750
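
As a small illustration of the "states as features, actions as labels" idea for discrete actions, the sketch below trains a classifier on expert data and uses it as the policy. The data, the two-number state (offset, heading), and the action names are invented, and a nearest-centroid classifier is used only because it fits in a few lines; any classifier could take its place.

```python
from collections import defaultdict

# Hypothetical expert data: state = (offset, heading), label = discrete action.
expert_data = [
    ((-1.0, 0.0), "right"), ((-0.8, -0.2), "right"),
    ((1.0, 0.0), "left"), ((0.8, 0.2), "left"),
    ((0.0, 0.0), "straight"), ((0.1, -0.1), "straight"),
]


def fit_centroids(data):
    """Training: average the states observed for each expert action."""
    groups = defaultdict(list)
    for state, action in data:
        groups[action].append(state)
    return {
        action: tuple(sum(dim) / len(states) for dim in zip(*states))
        for action, states in groups.items()
    }


def policy(state, centroids):
    """Prediction: choose the action whose centroid is closest to the state."""
    def sq_dist(centroid):
        return sum((x - c) ** 2 for x, c in zip(state, centroid))
    return min(centroids, key=lambda action: sq_dist(centroids[action]))


centroids = fit_centroids(expert_data)
print(policy((-0.7, 0.1), centroids))
```

For continuous actions the same data layout applies, with the classifier swapped for a regressor.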

-----

The Star Also Rises

Thursday, August 19, 2021
