Combining latent learning with dynamic programming in the modular anticipatory classifier system
Authors: Pierre Gérard, Jean-Arcady Meyer, Olivier Sigaud
Institution: a. Dassault Aviation, DGT/DPR/ESA, 78 Quai Marcel Dassault, 92552 St-Cloud Cedex, France; b. AnimatLab (LIP6), 8 rue du Capitaine Scott, 75015 Paris, France
Abstract: Learning Classifier Systems (LCS) are rule-based Reinforcement Learning (RL) systems endowed with a generalization capability. In this paper, we highlight the differences between two kinds of LCS: some are used to perform RL directly, while others latently learn a model of the interactions between the agent and its environment. Such a model can be used to speed up the core RL process, so these two kinds of learning are complementary. We show how the notion of generalization differs depending on whether the system anticipates, as in the Anticipatory Classifier System (ACS) and Yet Another Classifier System (YACS), or not, as in XCS. Moreover, we point out some limitations of the formalism common to ACS and YACS, and propose a new system, the Modular Anticipatory Classifier System (MACS), which allows the latent learning process to take advantage of new regularities. We describe how the learned model can be used to perform active exploration and how this exploration may be aggregated with the policy resulting from the reinforcement learning process. The different algorithms are validated experimentally, and some limitations in the presence of uncertainties are highlighted.
Keywords: Artificial intelligence; Reinforcement learning; Latent learning; Learning classifier systems; Dynamic programming
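The abstract outlines the general principle of combining a latently learned model of the environment with dynamic programming. As a rough illustration of that principle only, and not of the MACS formalism itself, the Python sketch below records state transitions from experience even when no reward is received, then runs value iteration over the learned model to derive a value function and a greedy policy. All names (LatentModelAgent, learn_transition, plan, act) are invented for this example.

```python
# Illustrative sketch only: a tabular agent that latently learns a deterministic
# transition model and then applies dynamic programming (value iteration) to it.
# This is NOT the MACS algorithm; class and method names are hypothetical.
from collections import defaultdict

class LatentModelAgent:
    def __init__(self, gamma=0.9):
        self.gamma = gamma
        self.transitions = {}               # (state, action) -> next state
        self.rewards = defaultdict(float)   # (state, action) -> last observed reward
        self.values = defaultdict(float)    # state -> estimated value

    def learn_transition(self, state, action, next_state, reward=0.0):
        # Latent learning: the model is updated from every experienced transition,
        # whether or not a reward is received.
        self.transitions[(state, action)] = next_state
        self.rewards[(state, action)] = reward

    def plan(self, sweeps=100):
        # Value iteration over the learned model: V(s) = max_a [R(s,a) + gamma * V(s')].
        states = {s for (s, _) in self.transitions}
        for _ in range(sweeps):
            for s in states:
                qs = [self.rewards[(s, a)] + self.gamma * self.values[ns]
                      for (s2, a), ns in self.transitions.items() if s2 == s]
                if qs:
                    self.values[s] = max(qs)

    def act(self, state, actions):
        # Greedy policy derived from the model and the planned value function.
        def q(a):
            ns = self.transitions.get((state, a))
            if ns is None:
                return float("-inf")  # an unknown transition could instead trigger exploration
            return self.rewards[(state, a)] + self.gamma * self.values[ns]
        return max(actions, key=q)
```

In this sketch, a caller would feed every observed (state, action, next_state, reward) tuple to learn_transition during exploration and invoke plan before exploiting. The paper's model, by contrast, is built from generalized classifiers rather than a lookup table, and the learned model is also used to drive active exploration.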