Conference paper, Year: 2013

Learning how to reach various goals by autonomous interaction with the environment: unification and comparison of exploration strategies

Abstract

In the field of developmental robotics, we are particularly interested in exploration strategies that can drive an agent to learn how to reach a wide variety of goals. In this paper, we unify and compare such strategies, recently shown to be efficient for learning complex, non-linear, redundant sensorimotor mappings. They combine two main principles. The first concerns the space in which the learning agent chooses points to explore (motor space vs. goal space). Previous work has shown that redundant inverse models can be learned more efficiently if exploration is driven by goal babbling, which triggers reaching, rather than by direct motor babbling. Goal babbling is especially efficient for learning highly redundant mappings (e.g., the inverse kinematics of an arm). At each time step, the agent chooses a goal in a goal space (e.g., uniformly), uses its current inverse model to infer a motor command to reach that goal, observes the corresponding consequence, and updates the inverse model according to this new experience. This exploration strategy allows the agent to cover the goal space more efficiently, avoiding wasted time in redundant parts of the sensorimotor space (e.g., executing many motor commands that all reach the same goal).

The second principle comes from the field of active learning, where exploration strategies are conceived as an optimization process. Samples in the input space (i.e., the motor space) are collected so as to minimize a given property of the learning process, e.g., the uncertainty or the prediction error of the model. This allows the agent to focus on parts of the sensorimotor space where exploration is expected to improve the quality of the model.
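The goal-babbling loop described above can be made concrete in a few lines. The following is a minimal sketch, not the authors' implementation: it assumes a three-joint planar arm as the sensorimotor mapping, a crude nearest-neighbour inverse model, and uniform goal sampling; all names, lengths, and parameters are illustrative.

```python
"""Minimal goal-babbling sketch (illustrative reconstruction, not the
paper's code). Assumed setup: 3-joint planar arm, nearest-neighbour
inverse model, goals drawn uniformly from a 2-D goal space."""
import numpy as np

rng = np.random.default_rng(0)
LINKS = np.array([0.5, 0.3, 0.2])  # assumed arm segment lengths

def forward(q):
    """Forward kinematics: joint angles -> end-effector (x, y)."""
    angles = np.cumsum(q)
    return np.array([np.sum(LINKS * np.cos(angles)),
                     np.sum(LINKS * np.sin(angles))])

motors, outcomes = [], []  # experience database of (command, consequence)

def inverse(goal, sigma=0.1):
    """Crude inverse model: perturb the motor command whose recorded
    outcome is closest to the requested goal (nearest neighbour)."""
    d = np.linalg.norm(np.array(outcomes) - goal, axis=1)
    return motors[int(np.argmin(d))] + rng.normal(0.0, sigma, 3)

# Bootstrap the database with a few random commands (motor babbling).
for _ in range(10):
    q = rng.uniform(-np.pi, np.pi, 3)
    motors.append(q); outcomes.append(forward(q))

# Goal babbling loop: choose a goal, infer a command, observe, update.
for _ in range(2000):
    goal = rng.uniform(-1.0, 1.0, 2)   # uniform choice in goal space
    q = inverse(goal)                  # infer a motor command
    motors.append(q); outcomes.append(forward(q))

# Rough coverage measure: mean distance from random test goals to the
# nearest outcome actually reached during exploration.
test = rng.uniform(-1.0, 1.0, (200, 2))
err = [np.min(np.linalg.norm(np.array(outcomes) - g, axis=1)) for g in test]
print(f"mean goal-space coverage error: {np.mean(err):.3f}")
```

An active-learning variant, following the second principle, would replace the uniform goal choice with sampling biased toward regions where the model's prediction error or uncertainty is expected to decrease fastest.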
Files

rldm.pdf (99.41 KB)
poster.rldm.2013.pdf (923.6 KB)

Origin: Publisher files allowed on an open archive
Format: Other

Dates and versions

hal-00922537, version 1 (27-12-2013)

Identifiers

  • HAL Id: hal-00922537, version 1

Cite

Clément Moulin-Frier, Pierre-Yves Oudeyer. Learning how to reach various goals by autonomous interaction with the environment: unification and comparison of exploration strategies. 1st Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM2013), Princeton University, New Jersey, Oct 2013, Princeton, United States. ⟨hal-00922537⟩