Markov Decision Processes with Variance Minimization: A New Condition and Approach |
| |
Authors: | Quanxin Zhu; Xianping Guo |
| |
Affiliation: | 1. Department of Mathematics, South China Normal University, Guangzhou, P.R. China (zqx22@126.com); 3. The School of Mathematics and Computational Science, Zhongshan University, Guangzhou, P.R. China |
| |
Abstract: | This article deals with the limiting average variance criterion for discrete-time Markov decision processes in Borel spaces. The costs may have neither upper nor lower bounds. We propose a new set of conditions under which we prove the existence of a variance minimal policy in the class of average expected cost optimal stationary policies. Our conditions are weaker than those in the previous literature. Moreover, some sufficient conditions for the existence of a variance minimal policy are imposed on the primitive data of the model. In particular, this paper is the first to use a stochastic monotonicity condition to study the limiting average variance criterion. Also, the "optimality inequality approach" provided here differs from the "optimality equation approach" widely used in the previous literature. Finally, we use a controlled queueing system to illustrate our results. |
| |
Keywords: | Discrete-time Markov decision process; Optimality inequality; Optimal stationary policy; Variance-minimization |
|
|