A Counterexample on Sample-Path Optimality in Stable Markov Decision Chains with the Average Reward Criterion |
| |
Authors: | Rolando Cavazos-Cadena, Raúl Montes-de-Oca, Karel Sladký |
| |
Institution: | 1. Departamento de Estadística y Cálculo, Universidad Autónoma Agraria Antonio Narro, Buenavista, Saltillo Coah, 25315, Mexico; 2. Departamento de Matemáticas, Universidad Autónoma Metropolitana, Campus Iztapalapa, Avenida San Rafael Atlixco 186, Colonia Vicentina, México, 09340, Mexico; 3. Institute of Information Theory and Automation, Pod Vodárenskou věží 4, 182 08, Praha 8, Czech Republic
|
| |
Abstract: | This note deals with Markov decision chains evolving on a denumerable state space. Under standard continuity-compactness requirements, an explicit example is provided to show that, with respect to a strong sample-path average reward criterion, the Lyapunov function condition does not ensure the existence of an optimal stationary policy. |
| |
Keywords: | |
This article is indexed in SpringerLink and other databases. |
|