Contractivity of Bellman operator in risk averse dynamic programming with infinite horizon
Affiliation:1. Charles University, Faculty of Mathematics and Physics, Department of Probability and Mathematical Statistics, Sokolovska 83, Prague, Czech Republic;2. The Czech Academy of Sciences, Institute of Information Theory and Automation, Pod Vodarenskou vezi 4, Prague, Czech Republic
Abstract:This paper deals with a risk-averse dynamic programming problem with an infinite horizon. First, the assumptions required for the problem to be well defined are formulated. The Bellman equation is then derived; it can also be viewed as a standalone reinforcement learning problem. The Bellman operator is proved to be a contraction, which guarantees the convergence of various solution algorithms used for dynamic programming as well as reinforcement learning, as we demonstrate on the value iteration and policy iteration algorithms.
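The abstract's argument can be illustrated with a small sketch: when the risk-averse Bellman operator is a γ-contraction in the sup norm, value iteration converges geometrically to its unique fixed point. The sketch below is not the paper's construction; it assumes a toy finite MDP with randomly generated costs and transitions, and uses the entropic risk measure (which is monotone and cash-invariant, hence nonexpansive) as one possible risk mapping. All names and parameter values are illustrative.

```python
import numpy as np

# Hypothetical small MDP (illustrative data, not from the paper):
# P[a][s] is the next-state distribution under action a in state s;
# c[s, a] is the immediate cost.
rng = np.random.default_rng(0)
n_states, n_actions = 3, 2
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))
c = rng.uniform(0.0, 1.0, size=(n_states, n_actions))
gamma = 0.9   # discount factor
beta = 0.5    # risk-aversion parameter of the entropic risk measure

def entropic(v_next, p, beta):
    """Entropic risk of next-state values under distribution p:
    rho(V) = (1/beta) * log E_p[exp(beta * V)]; recovers E_p[V] as beta -> 0."""
    return np.log(p @ np.exp(beta * v_next)) / beta

def bellman(v):
    """Risk-averse Bellman operator: (T v)(s) = min_a [c(s,a) + gamma * rho(v)].
    Monotonicity and cash-invariance of rho make T a gamma-contraction in sup norm."""
    q = np.array([[c[s, a] + gamma * entropic(v, P[a, s], beta)
                   for a in range(n_actions)] for s in range(n_states)])
    return q.min(axis=1)

# Value iteration: ||T v - T w||_inf <= gamma * ||v - w||_inf, so the
# successive differences shrink geometrically and iteration converges.
v = np.zeros(n_states)
for k in range(1000):
    v_new = bellman(v)
    if np.max(np.abs(v_new - v)) < 1e-10:
        break
    v = v_new
```

Policy iteration can be treated the same way: policy evaluation solves the fixed-point equation for a fixed policy's (linear-in-structure) operator, and the contraction property again guarantees that each evaluation step is well posed.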
Keywords:Risk aversion  Dynamic programming  Infinite horizon
Indexed by ScienceDirect and other databases.