Denumerable controlled Markov chains with average reward criterion: Sample path optimality
Authors:Rolando Cavazos-Cadena  Emmanuel Fernández-Gaucherand
Institution:(1) Departamento de Estadística y Cálculo, Universidad Autónoma Agraria Antonio Narro, Buenavista, 25315 Saltillo, Coah. Mexico;(2) Systems & Industrial Engineering Department, The University of Arizona, 85721 Tucson, AZ, USA
Abstract:We consider discrete-time nonlinear controlled stochastic systems, modeled by controlled Markov chains with denumerable state space and compact action space. The corresponding stochastic control problem of maximizing average rewards in the long run is studied. Departing from the most common position, which uses expected values of rewards, we focus on a sample path analysis of the stream of states/rewards. Under a Lyapunov function condition, we show that stationary policies obtained from the average reward optimality equation are not only average reward optimal, but indeed sample path average reward optimal, for almost all sample paths.
Research supported by a U.S.-México Collaborative Research Program funded by the National Science Foundation under grant NSF-INT 9201430, and by CONACyT-MEXICO. Partially supported by the MAXTOR Foundation for applied Probability and Statistics, under grant No. 01-01-56/04-93. Research partially supported by the Engineering Foundation under grant RI-A-93-10, and by a grant from the AT&T Foundation.
Keywords:Denumerable Controlled Markov Chains  Average Reward Criterion  Sample Path Optimality  Lyapunov Function Condition
This article is indexed in SpringerLink and other databases.