Constrained denumerable state non-stationary MDPs with expected total reward criterion
Authors: Guo Xianping
Institution: Department of Mathematics, Zhongshan University, 510275 Guangzhou, China
Abstract: In this paper, we consider constrained denumerable state non-stationary Markov decision processes (MDPs, for short) with the expected total reward criterion. By introducing a Lagrange multiplier and using methods from probability and analysis, we prove the existence of constrained optimal policies. Moreover, we prove that a constrained optimal policy may be taken to be either a Markov policy or a randomized Markov policy that randomizes between two Markov policies differing in only one state.
Keywords: Non-stationary MDPs; expected total reward criterion; constrained optimal policies
This article is indexed in CNKI, SpringerLink, and other databases.