The Average Variance Criterion for Nonstationary MDP with Borel State Space

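Cite as:	GUO Xian Ping. The Average Variance Criterion for Nonstationary MDP with Borel State Space [J]. Acta Mathematica Sinica, 2001, 44(2): 333-342.

Author:	GUO Xian Ping

Affiliation:	Department of Mathematics, Zhongshan University

Funding:	Supported by the National Natural Science Foundation of China (19901038), the Natural Science Foundation of Guangdong Province, and the Foundation of the Zhongshan University Advanced Research Center, Hong Kong

Abstract:	This paper considers the average variance criterion for nonstationary MDPs with Borel state and action spaces. First, under an ergodicity condition, the optimality equations are used to prove the existence of a Markov policy that is optimal for the average expected criterion. Then, by constructing a new model and applying the theory of Markov processes, it is further proved that, within the class of Markov policies that are optimal for the average expected criterion, there exists a Markov policy that minimizes the average variance. As special cases, the main results of Dynkin E. B. and Yushkevich A. A., of Kurano M., and of others are also obtained.
Keywords:	Nonstationary MDP; average variance criterion; optimality equations; optimal Markov policies
Article ID:	0583-1431(2001)02-0333-10
Revised manuscript received:	31 March 1998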
The Average Variance Criterion for Nonstationary MDP with Borel State Space

GUO Xian Ping. The Average Variance Criterion for Nonstationary MDP with Borel State Space [J]. Acta Mathematica Sinica, 2001, 44(2): 333-342.

Authors:	GUO Xian Ping

Institution:	Department of Mathematics, Zhongshan University, Guangzhou 510275, P. R. China

Abstract:	In this paper, we consider the average variance criterion for nonstationary Markov decision processes (MDPs) with Borel state space. First, using the optimality equations, we prove the existence of Markov policies that are optimal for the average expected criterion under ergodic conditions. Second, by constructing a new model and applying the theory of Markov processes, we prove that, within the class of Markov policies that are optimal for the average expected criterion, there exists one that minimizes the average variance. These results extend the main results obtained by Dynkin E. B. and Yushkevich A. A., by Kurano M., and by others.

Keywords:	Nonstationary MDP; Average variance criterion; Optimality equations; Optimal Markov policies
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.