Demonstration Guided Actor-Critic Deep Reinforcement Learning for Fast Teaching of Robots in Dynamic Environments - Archive ouverte HAL
Journal article, IFAC-PapersOnLine, Year: 2020

Demonstration Guided Actor-Critic Deep Reinforcement Learning for Fast Teaching of Robots in Dynamic Environments


Abstract

Using direct reinforcement learning (RL) to accomplish a task can be very inefficient, especially in robotic configurations where interactions with the environment are lengthy and costly. Learning from expert demonstration (LfD) is an alternative approach that achieves better performance in an RL setting and also greatly improves sample efficiency. We propose a novel demonstration learning framework for actor-critic based algorithms. Firstly, we put forward an environment pre-training paradigm to initialize the model parameters without interacting with the target environment, which effectively avoids the cold start problem in deep RL scenarios. Secondly, we design a general-purpose LfD framework for most of the mainstream actor-critic RL algorithms that include a policy network and a value function, such as PPO, SAC, TRPO, and A3C. Thirdly, we build a dedicated model training platform to perform the human-robot interaction and numerical experimentation. We evaluate the method in six MuJoCo simulated locomotion environments and on our robot control simulation platform. Results show that several epochs of pre-training can improve the agent's performance in the early stage of training. The final converged performance of the RL algorithm is also boosted by external demonstration. Overall, sample efficiency is improved by 30% with the proposed method. Our demonstration pipeline makes full use of the exploration property of the RL algorithm, which makes it feasible for fast teaching of robots in dynamic environments.
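The abstract does not state which pre-training losses are used, so the following is only a minimal sketch of the general idea of initializing an actor and a critic from demonstrations before any interaction with the target environment. It assumes behaviour cloning for the actor and regression onto discounted returns for the critic, with linear models standing in for the networks. The helper names (`discounted_returns`, `fit_linear`) and the toy 1-D expert are hypothetical and are not taken from the paper.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return-to-go G_t = r_t + gamma * G_{t+1} for one trajectory."""
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def fit_linear(xs, ys, lr=0.05, epochs=2000):
    """Fit y ~ w*x + b by full-batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 2.0 / n * sum((w * x + b - y) * x for x, y in zip(xs, ys))
        grad_b = 2.0 / n * sum((w * x + b - y) for x, y in zip(xs, ys))
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical demonstration (assumption, not from the paper): a 1-D state
# is driven toward zero by an expert applying a = -0.5 * s; reward is -|s|.
states, actions, rewards = [], [], []
s = 2.0
for _ in range(20):
    a = -0.5 * s
    states.append(s)
    actions.append(a)
    rewards.append(-abs(s))
    s = s + a

# Actor pre-training: behaviour cloning of the expert's state-action pairs.
actor_w, actor_b = fit_linear(states, actions)

# Critic pre-training: regress state values onto the demonstrated returns.
critic_w, critic_b = fit_linear(states, discounted_returns(rewards))
```

In a full pipeline the pre-trained parameters would then serve as the starting point for an actor-critic algorithm such as PPO or SAC, which continues to explore and improve in the target environment.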
Main file: CPHS2020.pdf (3.03 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-03434380, version 1 (18-11-2021)

Cite

Liang Gong, Te Sun, Xudong Li, Ke Lin, Natalia Díaz-Rodríguez, et al. Demonstration Guided Actor-Critic Deep Reinforcement Learning for Fast Teaching of Robots in Dynamic Environments. IFAC-PapersOnLine, 2020, 53 (5), pp.271-278. ⟨10.1016/j.ifacol.2021.04.227⟩. ⟨hal-03434380⟩