Exploiting Additive Structure in Factored MDPs for Reinforcement Learning

Thomas Degris; Olivier Sigaud; Pierre-Henri Wuillemin

doi:10.1007/978-3-540-89722-4_2

Conference paper, Year: 2008


Thomas Degris

Role: Author

Olivier Sigaud

Role: Author
PersonId : 14932
IdHAL : olivier-sigaud
ORCID : 0000-0002-8544-0229
IdRef : 072724714

Animatlab

Pierre-Henri Wuillemin

Role: Author
PersonId : 8633
IdHAL : pierre-henri-wuillemin
ORCID : 0000-0003-3691-4886
IdRef : 12747627X

DECISION

Abstract

SDYNA is a framework for addressing large, discrete, stochastic reinforcement learning problems. It incrementally learns a factored Markov decision process (FMDP) representing the problem to solve, while using FMDP planning techniques to build an efficient policy. SPITI, an instantiation of SDYNA, uses a planning method based on dynamic programming that cannot exploit the additive structure of an FMDP. In this paper, we present two new instantiations of SDYNA, namely ULP and UNATLP, which use a planning method based on linear programming that can exploit the additive structure of an FMDP and address problems out of reach of SPITI.
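As a rough illustration of what "additive structure" can mean in a factored MDP, the sketch below writes a reward function as a sum of local terms, each depending on only a few state variables. It is not taken from the paper: the four binary variables, the scopes, the reward values, and the names `LOCAL_REWARDS`, `additive_reward`, and `flat_reward_table` are all assumptions chosen for the example.

```python
from itertools import product

# Hypothetical problem: 4 binary state variables, indexed 0..3.
N_VARS = 4

# Additive reward: each local term depends only on a small scope of variables.
# Scopes and values are illustrative assumptions, not taken from the paper.
LOCAL_REWARDS = [
    ((0,), {(0,): 0.0, (1,): 1.0}),
    ((1, 2), {(0, 0): 0.0, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 2.0}),
    ((3,), {(0,): 0.0, (1,): 1.5}),
]


def additive_reward(state):
    """Sum the local reward terms, each evaluated on its own scope."""
    return sum(
        table[tuple(state[i] for i in scope)] for scope, table in LOCAL_REWARDS
    )


def flat_reward_table():
    """Enumerate every joint state, as a non-factored representation would."""
    return {s: additive_reward(s) for s in product((0, 1), repeat=N_VARS)}


if __name__ == "__main__":
    flat = flat_reward_table()
    local_entries = sum(len(table) for _, table in LOCAL_REWARDS)
    # The flat table grows exponentially with the number of variables,
    # while the additive form grows with the sizes of the local scopes.
    print(len(flat), "flat entries vs", local_entries, "local entries")
```

In this toy case the flat table needs one entry per joint state, while the additive form only stores the local tables. A planner that keeps the sum decomposed can work with the small tables directly, which is the kind of saving the abstract attributes to the linear programming based method.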

Domains

Computer Science [cs]


https://hal.science/hal-01302178

Submitted on: Wednesday, April 13, 2016, 16:17:00

Last modified on: Tuesday, April 11, 2023, 15:16:28

Dates and versions

hal-01302178, version 1 (13-04-2016)

Identifiers

HAL Id: hal-01302178, version 1
DOI: 10.1007/978-3-540-89722-4_2

Cite

Thomas Degris, Olivier Sigaud, Pierre-Henri Wuillemin. Exploiting Additive Structure in Factored MDPs for Reinforcement Learning. European Workshop on Reinforcement Learning, Jun 2008, Villeneuve d’Ascq, France. pp.15-26, ⟨10.1007/978-3-540-89722-4_2⟩. ⟨hal-01302178⟩


Collections

UPMC CNRS LIP6 SORBONNE-UNIVERSITE SU-SCIENCES

