

A Counterexample on Sample-Path Optimality in Stable Markov Decision Chains with the Average Reward Criterion
Authors: Rolando Cavazos-Cadena, Raúl Montes-de-Oca, Karel Sladký
Institution: 1. Departamento de Estadística y Cálculo, Universidad Autónoma Agraria Antonio Narro, Buenavista, Saltillo Coah, 25315, Mexico
2. Departamento de Matemáticas, Universidad Autónoma Metropolitana, Campus Iztapalapa, Avenida San Rafael Atlixco 186, Colonia Vicentina, México, 09340, Mexico
3. Institute of Information Theory and Automation, Pod Vodárenskou věží 4, 182 08, Praha 8, Czech Republic
Abstract: This note deals with Markov decision chains evolving on a denumerable state space. Under standard continuity-compactness requirements, an explicit example is provided to show that, with respect to a strong sample-path average reward criterion, the Lyapunov function condition does not ensure the existence of an optimal stationary policy.
This article is indexed in SpringerLink and other databases.