A survey of recent results on continuous-time Markov decision processes

Authors:	Xianping Guo, Onésimo Hernández-Lerma, Tomás Prieto-Rumeau, Xi-Ren Cao, Junyu Zhang, Qiying Hu, Mark E Lewis, Ricardo Vélez

Institution:	(1) Zhongshan University, P.R. China; (2) CINVESTAV-IPN, Mexico; (3) Universidad Nacional de Educación a Distancia, Spain; (4) Hong Kong University of Science and Technology, Hong Kong; (5) Shanghai University, China; (6) Cornell University, USA; (7) Universidad Nacional de Educación a Distancia, Spain

Abstract:	This paper is a survey of recent results on continuous-time Markov decision processes (MDPs) with unbounded transition rates, and reward rates that may be unbounded from above and from below. These results pertain to discounted and average reward optimality criteria, which are the most commonly used criteria, and also to more selective concepts, such as bias optimality and sensitive discount criteria. For concreteness, we consider only MDPs with a countable state space, but we indicate how the results can be extended to more general MDPs or to Markov games. Research partially supported by grants NSFC, DRFP and NCET. Research partially supported by CONACyT (Mexico) Grant 45693-F.

Keywords:	Continuous-time Markov decision processes (also known as controlled Markov chains); unbounded reward and transition rates; discounted reward; average reward; bias optimality; sensitive discount criteria
This article is indexed in SpringerLink and other databases.