Equivalence of Lyapunov stability criteria in a class of Markov decision processes |
| |
Authors: | Rolando Cavazos-Cadena, Onésimo Hernández-Lerma |
| |
Affiliation: | (1) Departamento de Estadística y Cálculo, Universidad Autónoma Agraria Antonio Narro, Buenavista 25315, Saltillo, COAH, México; (2) Departamento de Matemáticas, CINVESTAV-IPN, Apartado Postal 14-740, 07000 México D.F., México |
| |
Abstract: | We are concerned with Markov decision processes with countable state space and discrete-time parameter. The main structural restriction on the model is the following: under the action of any stationary policy, the state space is a communicating class. In this context, we prove the equivalence of ten stability/ergodicity conditions on the transition law of the model, which imply the existence of average optimal stationary policies for an arbitrary continuous and bounded reward function; these conditions include the Lyapunov function condition (LFC) introduced by A. Hordijk. As a consequence of our results, the LFC is proved to be equivalent to the following: under the action of any stationary policy, the corresponding Markov chain has a unique invariant distribution, which depends continuously on the stationary policy being used. A weak form of the latter condition was used by one of the authors to establish the existence of optimal stationary policies using an approach based on renewal theory. This research was supported in part by the Third World Academy of Sciences (TWAS) under Grant TWAS RG MP 898-152. |
| |
Keywords: | Markov decision processes; Average reward criterion; Lyapunov stability criteria; Lyapunov function condition; Continuity of the invariant distribution; Renewal theory |
This article is indexed in SpringerLink and other databases. |
|