

A survey of recent results on continuous-time Markov decision processes
Authors:Xianping Guo  Onésimo Hernández-Lerma  Tomás Prieto-Rumeau  Xi-Ren Cao  Junyu Zhang  Qiying Hu  Mark E Lewis  Ricardo Vélez
Institution:(1) Zhongshan University, P.R. China;(2) CINVESTAV-IPN, Mexico;(3) Universidad Nacional de Educación a Distancia, Spain;(4) Hong Kong University of Science and Technology, Hong Kong;(5) Shanghai University, China;(6) Cornell University, USA;(7) Universidad Nacional de Educación a Distancia, Spain
Abstract:This paper is a survey of recent results on continuous-time Markov decision processes (MDPs) with unbounded transition rates, and reward rates that may be unbounded from above and from below. These results pertain to discounted and average reward optimality criteria, which are the most commonly used criteria, and also to more selective concepts, such as bias optimality and sensitive discount criteria. For concreteness, we consider only MDPs with a countable state space, but we indicate how the results can be extended to more general MDPs or to Markov games. Research partially supported by grants NSFC, DRFP and NCET. Research partially supported by CONACyT (Mexico) Grant 45693-F.
Keywords:Continuous-time Markov decision processes (also known as controlled Markov chains)  unbounded reward and transition rates  discounted reward  average reward  bias optimality  sensitive discount criteria
This article is indexed in SpringerLink and other databases.
