CLIC: Curriculum Learning and Imitation for object Control in non-rewarding environments
Journal article, IEEE Transactions on Cognitive and Developmental Systems, 2021

Abstract

In this paper, we study a new reinforcement learning setting in which the environment is non-rewarding and contains several possibly related objects of varying controllability. In this environment, an apt agent, Bob, acts following his own goals without necessarily providing helpful demonstrations, and the objective of the learning agent is to learn to control each object individually. We present a generic discrete-state, discrete-action model of such environments, and an unsupervised reinforcement learning agent called CLIC (Curriculum Learning and Imitation for Control) that achieves this objective. CLIC selects the objects to focus on when training and imitating by maximizing its learning progress. We show that CLIC can effectively observe Bob to gain control of objects faster, even if Bob is not explicitly teaching. Despite choosing what it imitates in a principled way, CLIC retains the natural ability to follow Bob when he provides ordered demonstrations. Finally, we show that, compared with a non-curriculum-based agent, CLIC achieves faster mastery of the environment when Bob controls objects that the agent cannot, or when there is a hierarchy between objects in the environment, by ignoring non-reproducible and already mastered interactions with objects when imitating.
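
To make the idea of selecting objects by learning progress concrete, the following is a minimal sketch and not the authors' implementation. It assumes that learning progress for an object is the absolute change in success rate between two consecutive windows of attempts, and that objects are sampled proportionally to that quantity with a small uniform exploration term. The class name `LearningProgressSelector` and the parameters `window` and `epsilon` are hypothetical and do not come from the paper.

```python
import random
from collections import deque


class LearningProgressSelector:
    """Hypothetical helper: picks which object to practice or imitate next."""

    def __init__(self, objects, window=10, epsilon=0.2, seed=0):
        self.objects = list(objects)
        self.window = window          # assumed size of each comparison window
        self.epsilon = epsilon        # assumed share of uniform exploration
        self.rng = random.Random(seed)
        # Per object: the most recent 2 * window outcomes (1.0 success, 0.0 failure).
        self.history = {o: deque(maxlen=2 * window) for o in self.objects}

    def record(self, obj, success):
        """Store the outcome of one attempt to control `obj`."""
        self.history[obj].append(1.0 if success else 0.0)

    def learning_progress(self, obj):
        """Absolute change in success rate between the older and newer window."""
        h = list(self.history[obj])
        if len(h) < 2 * self.window:
            return 0.0  # not enough data yet to estimate progress
        older = sum(h[:self.window]) / self.window
        recent = sum(h[self.window:]) / self.window
        return abs(recent - older)

    def probabilities(self):
        """Mix uniform sampling with sampling proportional to learning progress."""
        lp = [self.learning_progress(o) for o in self.objects]
        total = sum(lp)
        uniform = 1.0 / len(self.objects)
        if total == 0.0:
            return [uniform] * len(self.objects)
        return [self.epsilon * uniform + (1 - self.epsilon) * p / total for p in lp]

    def select(self):
        """Sample one object according to the current probabilities."""
        return self.rng.choices(self.objects, weights=self.probabilities(), k=1)[0]
```

Under these assumptions, an object that is already mastered (constant success) and an object that cannot be controlled (constant failure) both have zero learning progress, so the selector mostly ignores them, which mirrors the behaviour the abstract describes for non-reproducible and already mastered interactions.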

Dates and versions

hal-02370859, version 1 (19-11-2019)

Identifiers

HAL Id: hal-02370859
DOI: 10.1109/TCDS.2019.2933371

Cite

Pierre Fournier, Cédric Colas, Mohamed Chetouani, Olivier Sigaud. CLIC: Curriculum Learning and Imitation for object Control in non-rewarding environments. IEEE Transactions on Cognitive and Developmental Systems, 2021, 13 (2), pp.239-248. ⟨10.1109/TCDS.2019.2933371⟩. ⟨hal-02370859⟩