
The Optimal Policy Problem for Non-homogeneous Partially Observable Markov Decision Processes

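Cite this article:	Zhang Jihong, Guo Shizheng, Zhang Yun. The Optimal Policy Problem for Non-homogeneous Partially Observable Markov Decision Processes[J]. OR Transactions (运筹学学报), 2004, 8(2): 81-87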

Authors:	Zhang Jihong, Guo Shizheng, Zhang Yun

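Affiliations:	1. Academy of Mathematics and System Science, Chinese Academy of Sciences, Beijing 100080; School of International Business, Beijing Foreign Studies University, Beijing 100089. 2. School of Sciences, Kunming University of Science and Technology, Kunming 650091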

Funding:	Supported by the National Natural Science Foundation of China

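Abstract:	This paper discusses a class of non-homogeneous partially observable Markov decision models. Without changing the countability of the state space, the model is transformed into the generalized discounted model of [5], which solves its optimal policy problem. A finite-horizon approximation algorithm for the model is also obtained, and the states this algorithm involves are countable.
Keywords:	partially observable Markov decision processes; optimal policy; non-homogeneous; discounted model; approximation
Revised:	2000-07-10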
Partially Observable Non-homogeneous Markov Decision Processes

Zhang Jihong, Guo Shizheng, Zhang Yun. Partially Observable Non-homogeneous Markov Decision Processes[J]. OR Transactions, 2004, 8(2): 81-87

Authors:	Zhang Jihong, Guo Shizheng, Zhang Yun

Affiliation:	Academy of Mathematics and System Science, Chinese Academy of Sciences, Beijing 100080, China; School of International Business, Beijing Foreign Studies University, Beijing 100089, China; School of Sciences, Kunming University of Science and Technology, Kunming 650091, China

Abstract:	In this paper, we discuss the optimal policy problem for partially observable non-homogeneous Markov decision processes (NPOMDP). By a new handling method, the NPOMDP model is transformed into the generalized discounted Markov decision process with countable state space of [5]. From this, the optimal policy problem of the model is solved and a finite-horizon approximation algorithm is given, in which the state space involved is countable.

Keywords:	OR; partially observable Markov decision processes; non-homogeneous; optimal policy; discounted model; approximation algorithm
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.
