Blackwell optimality in the class of stationary policies in Markov decision chains with a Borel state space and unbounded rewards

Authors:	Arie Hordijk, Alexander A. Yushkevich

Institutions:	Department of Mathematics and Computer Science, Leiden University, 2300 RA Leiden, The Netherlands (e-mail: hordijk@wi.leidenuniv.nl); Department of Mathematics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA (e-mail: aayushke@email.uncc.edu)

Abstract:	This paper is the first part of a study of Blackwell optimal policies in Markov decision chains with a Borel state space and unbounded rewards. We prove the existence of deterministic stationary policies that are Blackwell optimal in the class of all stationary policies, which may in general be randomized. We also establish a lexicographical policy improvement algorithm leading to Blackwell optimal policies, and the relation between such policies and the Blackwell optimality equation. Our technique combines the weighted norms approach developed in Dekker and Hordijk (1988) for countable models with unbounded rewards with the weak-strong topology approach used in Yushkevich (1997a) for Borel models with bounded rewards.

This article is indexed in SpringerLink and other databases.