Equivalence of Lyapunov stability criteria in a class of Markov decision processes
Authors:Rolando Cavazos-Cadena, Onésimo Hernández-Lerma
Affiliation:(1) Departamento de Estadística y Cálculo, Universidad Autónoma Agraria Antonio Narro, Buenavista 25315, Saltillo, COAH, México; (2) Departamento de Matemáticas, CINVESTAV-IPN, Apartado Postal 14-740, 07000 México D.F., México
Abstract:We are concerned with Markov decision processes with countable state space and discrete-time parameter. The main structural restriction on the model is the following: under the action of any stationary policy, the state space is a communicating class. In this context, we prove the equivalence of ten stability/ergodicity conditions on the transition law of the model, which imply the existence of average optimal stationary policies for an arbitrary continuous and bounded reward function; these conditions include the Lyapunov function condition (LFC) introduced by A. Hordijk. As a consequence of our results, the LFC is proved to be equivalent to the following: under the action of any stationary policy, the corresponding Markov chain has a unique invariant distribution which depends continuously on the stationary policy being used. A weak form of the latter condition was used by one of the authors to establish the existence of optimal stationary policies using an approach based on renewal theory. This research was supported in part by the Third World Academy of Sciences (TWAS) under Grant TWAS RG MP 898-152.
Keywords:Markov decision processes; Average reward criterion; Lyapunov stability criteria; Lyapunov function condition; Continuity of the invariant distribution; Renewal theory
This article is indexed in SpringerLink and other databases.