Constrained denumerable state non-stationary MDPs with expected total reward criterion

Authors:	Guo Xianping

Institution:	Department of Mathematics, Zhongshan University, 510275 Guangzhou, China

Abstract:	In this paper, we consider constrained denumerable state non-stationary Markov decision processes (MDPs, for short) with the expected total reward criterion. By introducing a Lagrange multiplier and using probabilistic and analytic methods, we prove the existence of constrained optimal policies. Moreover, we prove that a constrained optimal policy may be a Markov policy or a randomized Markov policy that randomizes between two Markov policies that differ in only one state.

Keywords:	Non-stationary MDPs; expected total reward criterion; constrained optimal policies
This article is indexed in CNKI, SpringerLink, and other databases.