A Counterexample on Sample-Path Optimality in Stable Markov Decision Chains with the Average Reward Criterion

Authors:	Rolando Cavazos-Cadena Raúl Montes-de-Oca Karel Sladký

Institution:	1. Departamento de Estadística y Cálculo, Universidad Autónoma Agraria Antonio Narro, Buenavista, Saltillo Coah, 25315, Mexico 2. Departamento de Matemáticas, Universidad Autónoma Metropolitana, Campus Iztapalapa, Avenida San Rafael Atlixco 186, Colonia Vicentina, México, 09340, Mexico 3. Institute of Information Theory and Automation, Pod Vodárenskou věží 4, 182 08, Praha 8, Czech Republic

Abstract:	This note deals with Markov decision chains evolving on a denumerable state space. Under standard continuity-compactness requirements, an explicit example is provided to show that, with respect to a strong sample-path average reward criterion, the Lyapunov function condition does not ensure the existence of an optimal stationary policy.

Keywords:
This article is indexed in SpringerLink and other databases.