Denumerable controlled Markov chains with average reward criterion: Sample path optimality

Authors:	Rolando Cavazos-Cadena (1); Emmanuel Fernández-Gaucherand (2)

Institution:	(1) Departamento de Estadística y Cálculo, Universidad Autónoma Agraria Antonio Narro, Buenavista, 25315 Saltillo, Coah., Mexico; (2) Systems & Industrial Engineering Department, The University of Arizona, 85721 Tucson, AZ, USA

Abstract:	We consider discrete-time nonlinear controlled stochastic systems, modeled by controlled Markov chains with denumerable state space and compact action space. The corresponding stochastic control problem of maximizing average rewards in the long run is studied. Departing from the most common position, which uses expected values of rewards, we focus on a sample path analysis of the stream of states/rewards. Under a Lyapunov function condition, we show that stationary policies obtained from the average reward optimality equation are not only average reward optimal, but indeed sample path average reward optimal, for almost all sample paths.

Funding:	Research supported by a U.S.-México Collaborative Research Program funded by the National Science Foundation under grant NSF-INT 9201430, and by CONACyT-MEXICO. Partially supported by the MAXTOR Foundation for Applied Probability and Statistics, under grant No. 01-01-56/04-93. Research partially supported by the Engineering Foundation under grant RI-A-93-10, and by a grant from the AT&T Foundation.

Keywords:	Denumerable Controlled Markov Chains; Average Reward Criterion; Sample Path Optimality; Lyapunov Function Condition
This article is indexed in SpringerLink and other databases.