Relative Value Iteration Algorithm Based on Contraction Span Semi-Norm

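Cite this article:	Guanghua Hu, Cangpu Wu. Relative Value Iteration Algorithm Based on Contraction Span Semi-Norm [J]. 运筹学学报 (OR Transactions), 1999, 3(2): 1-9.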

Authors:	Guanghua Hu (胡光华), Cangpu Wu (吴沧浦)

Affiliation:	Department of Automatic Control, Beijing Institute of Technology (北京理工大学), Beijing 100081

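Abstract:	This paper studies the relative value iteration algorithm for average-reward Markov decision processes (MDPs). An expression for the contraction factor of the span semi-norm is given, and it is proved that when this factor is less than 1, both the relative value iteration algorithm given in this paper and the small-step relative value iteration algorithm converge to the optimal solution.
Keywords:	Markov decision processes; contraction mappings; dynamic programming; average reward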
Relative Value Iteration Algorithm Based on Contraction Span Semi-Norm

GUANGHUA HU, CANGPU WU. Relative Value Iteration Algorithm Based on Contraction Span Semi-Norm [J]. OR Transactions, 1999, 3(2): 1-9.

Authors:	GUANGHUA HU, CANGPU WU

Abstract:	In this paper, the relative value iteration (RVI) algorithm for average-reward Markov decision processes (MDPs) is investigated. An expression for the contraction factor of the span semi-norm is given, and the convergence of the RVI algorithm and the small-step RVI algorithm to the optimal solution is proved under the condition that this contraction factor is less than 1.

Keywords:	Markov decision processes; contraction mappings; dynamic programming; average reward
This article is indexed in CNKI, 维普 (VIP), and other databases.
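
To make the ideas in the abstract concrete, the following is a minimal Python sketch of relative value iteration for a small average-reward MDP, using the span semi-norm sp(v) = max(v) - min(v) as the stopping criterion. It is an illustration under stated assumptions, not the authors' algorithm: the record above does not give the paper's expression for the contraction factor, so `span_contraction_bound` uses a common textbook one-step bound instead, and the `tau` damping is only one plausible reading of the "small-step" variant. All function names, the toy MDP, and the parameters `tau`, `ref`, and `tol` are hypothetical.

```python
import numpy as np


def span(v):
    """Span semi-norm: sp(v) = max(v) - min(v)."""
    return float(np.max(v) - np.min(v))


def bellman(P, r, v):
    """Apply (Tv)(s) = max_a [ r(s,a) + sum_j P(j|s,a) v(j) ].

    P has shape (A, S, S) indexed as P[a, s, j]; r has shape (S, A).
    Returns the maximised values and a greedy policy.
    """
    q = r + np.einsum("asj,j->sa", P, v)  # q[s, a]
    return q.max(axis=1), q.argmax(axis=1)


def span_contraction_bound(P):
    """Textbook bound (assumed here, not the paper's expression):
    gamma = 1 - min over pairs of (s,a) rows of sum_j min(p_j, p'_j).
    """
    rows = P.reshape(-1, P.shape[-1])
    overlap = np.minimum(rows[:, None, :], rows[None, :, :]).sum(axis=2)
    return 1.0 - float(overlap.min())


def relative_value_iteration(P, r, tau=1.0, ref=0, tol=1e-10, max_iter=10_000):
    """Relative value iteration with an optional step size tau in (0, 1].

    tau = 1 is the plain scheme; tau < 1 is an assumed damped variant.
    The iterate is renormalised so that h[ref] = 0 after every step.
    """
    h = np.zeros(r.shape[0])
    for _ in range(max_iter):
        Th, policy = bellman(P, r, h)
        diff = Th - h
        if span(diff) < tol:  # Th - h is (nearly) constant: optimality equation holds
            break
        h_new = (1.0 - tau) * h + tau * Th
        h = h_new - h_new[ref]
    # min(diff) <= optimal gain <= max(diff); report the midpoint
    gain = 0.5 * (diff.max() + diff.min())
    return gain, h, policy


# Toy 2-state, 2-action MDP (made up for illustration)
P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # action 0
              [[0.5, 0.5], [0.6, 0.4]]])  # action 1
r = np.array([[1.0, 0.5],                 # rewards in state 0
              [2.0, 0.0]])                # rewards in state 1

gamma = span_contraction_bound(P)
gain, h, policy = relative_value_iteration(P, r)
print("contraction bound:", gamma)
print("gain:", gain, "relative values:", h, "policy:", policy)
```

For this toy MDP the bound evaluates to 0.7, which is below 1, and the iteration settles on a gain of 11/7 with the policy that takes action 1 in state 0 and action 0 in state 1. Calling `relative_value_iteration(P, r, tau=0.5)` should reach the same gain with smaller steps.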