Blackwell optimality in the class of stationary policies in Markov decision chains with a Borel state space and unbounded rewards
Authors: Arie Hordijk, Alexander A. Yushkevich
Institution: Department of Mathematics and Computer Science, Leiden University, 2300 RA Leiden, The Netherlands (e-mail: hordijk@wi.leidenuniv.nl)
Department of Mathematics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA (e-mail: aayushke@email.uncc.edu)
Abstract: This paper is the first part of a study of Blackwell optimal policies in Markov decision chains with a Borel state space and unbounded rewards. We prove here the existence of deterministic stationary policies that are Blackwell optimal in the class of all stationary policies, including randomized ones. We also establish a lexicographical policy improvement algorithm that leads to Blackwell optimal policies, and we relate such policies to the Blackwell optimality equation. Our technique combines the weighted norms approach developed in Dekker and Hordijk (1988) for countable models with unbounded rewards with the weak-strong topology approach used in Yushkevich (1997a) for Borel models with bounded rewards.
This article is indexed in SpringerLink and other databases.