Blackwell optimality in the class of stationary policies in Markov decision chains with a Borel state space and unbounded rewards |
| |
Authors: | Arie Hordijk, Alexander A. Yushkevich |
| |
Institution: | Department of Mathematics and Computer Science, Leiden University, 2300 RA Leiden, The Netherlands (e-mail: hordijk@wi.leidenuniv.nl); Department of Mathematics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA (e-mail: aayushke@email.uncc.edu) |
| |
Abstract: | This paper is the first part of a study of Blackwell optimal policies in Markov decision chains with a Borel state space and unbounded rewards. We prove the existence of deterministic stationary policies that are Blackwell optimal in the class of all stationary policies, which are in general randomized. We also establish a lexicographical policy improvement algorithm leading to Blackwell optimal policies, and the relation between such policies and the Blackwell optimality equation. Our technique combines the weighted norms approach developed in Dekker and Hordijk (1988) for countable models with unbounded rewards and the weak-strong topology approach used in Yushkevich (1997a) for Borel models with bounded rewards. |
| |
This article is indexed in SpringerLink and other databases. |
|