Learning from Unlabeled Interaction Frames

Jonathan Grizou

Résumé

This thesis investigates how a machine can be taught a new task from unlabeled human instructions, which is without knowing beforehand how to associate the human communicative signals with their meanings. The theoretical and empirical work presented in this thesis provides means to create calibration free interactive systems, which allow humans to interact with machines, from scratch, using their own preferred teaching signals. It therefore removes the need for an expert to tune the system for each specific user, which constitutes an important step towards flexible personalized teaching interfaces, a key for the future of personal robotics.Our approach assumes the robot has access to a limited set of task hypotheses, which include the task the user wants to solve. Our method consists of generating interpretation hypotheses of the teaching signals with respect to each hypothetic task. By building a set of hypothetic interpretation, i.e. a set of signal-label pairs for each task, the task the user wants to solve is the one that explains better the history of interaction.We consider different scenarios, including a pick and place robotics experiment with speech as the modality of interaction, and a navigation task in a brain computer interaction scenario. In these scenarios, a teacher instructs a robot to perform a new task using initially unclassified signals, whose associated meaning can be a feedback (correct/incorrect) or a guidance (go left, right, up, \ldots). Our results show that a) it is possible to learn the meaning of unlabeled and noisy teaching signals, as well as a new task at the same time, and b) it is possible to reuse the acquired knowledge about the teaching signals for learning new tasks faster. We further introduce a planning strategy that exploits uncertainty from the task and the signals' meanings to allow more efficient learning sessions. We present a study where several real human subjects control successfully a virtual device using their brain and without relying on a calibration phase. Our system identifies, from scratch, the target intended by the user as well as the decoder of brain signals.Based on this work, but from another perspective, we introduce a new experimental setup to study how humans behave in asymmetric collaborative tasks. In this setup, two humans have to collaborate to solve a task but the channels of communication they can use are constrained and force them to invent and agree on a shared interaction protocol in order to solve the task. These constraints allow analyzing how a communication protocol is progressively established through the interplay and history of individual actions.

Cette thèse s'intéresse à un problème logique dont les enjeux théoriques et pratiques sont multiples. De manière simple, il peut être présenté ainsi : imaginez que vous êtes dans un labyrinthe, dont vous connaissez toutes les routes menant à chacune des portes de sortie. Derrière l'une de ces portes se trouve un trésor, mais vous n'avez le droit d'ouvrir qu'une seule porte. Un vieil homme habitant le labyrinthe connaît la bonne sortie et se propose alors de vous aider à l'identifier. Pour cela, il vous indiquera la direction à prendre à chaque intersection. Malheureusement, cet homme ne parle pas votre langue, et les mots qu'il utilise pour dire ``droite'' ou ``gauche'' vous sont inconnus. Est-il possible de trouver le trésor et de comprendre l'association entre les mots du vieil homme et leurs significations ? Ce problème, bien qu'en apparence abstrait, est relié à des problématiques concrètes dans le domaine de l'interaction homme-machine. Remplaçons le vieil homme par un utilisateur souhaitant guider un robot vers une sortie spécifique du labyrinthe. Ce robot ne sait pas en avance quelle est la bonne sortie mais il sait où se trouvent chacune des portes et comment s'y rendre. Imaginons maintenant que ce robot ne comprenne pas a priori le langage de l'humain; en effet, il est très difficile de construire un robot à même de comprendre parfaitement chaque langue, accent et préférence de chacun. Il faudra alors que le robot apprenne l'association entre les mots de l'utilisateur et leur sens, tout en réalisant la tâche que l'humain lui indique (i.e. trouver la bonne porte). Une autre façon de décrire ce problème est de parler d'auto-calibration. En effet, le résoudre reviendrait à créer des interfaces ne nécessitant pas de phase de calibration car la machine pourrait s'adapter, automatiquement et pendant l'interaction, à différentes personnes qui ne parlent pas la même langue ou qui n'utilisent pas les mêmes mots pour dire la même chose. Cela veut aussi dire qu'il serait facile de considérer d’autres modalités d'interaction (par exemple des gestes, des expressions faciales ou des ondes cérébrales). Dans cette thèse, nous présentons une solution à ce problème. Nous appliquons nos algorithmes à deux exemples typiques de l'interaction homme-robot et de l'interaction cerveau-machine : une tâche d'organisation d'une série d'objets selon les préférences de l'utilisateur qui guide le robot par la voix, et une tâche de déplacement sur une grille guidé par les signaux cérébraux de l'utilisateur. Ces dernières expériences ont été faites avec des utilisateurs réels. Nos résultats démontrent expérimentalement que notre approche est fonctionnelle et permet une utilisation pratique d’une interface sans calibration préalable.

Learning from Unlabeled Interaction Frames

Apprentissage simultané d'une tâche nouvelle et de l'interprétation de signaux sociaux d'un humain en robotique

Résumé

Mots clés

Domaines

Dates et versions

Identifiants

Citer

Exporter

Partager