
The Optimal Policy Problem for Non-homogeneous Partially Observable Markov Decision Processes
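Cite this article: Zhang Jihong, Guo Shizheng, Zhang Yun. The Optimal Policy Problem for Non-homogeneous Partially Observable Markov Decision Processes[J]. OR Transactions, 2004, 8(2): 81-87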
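Authors: Zhang Jihong, Guo Shizheng, Zhang Yun
Affiliations: 1. Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100080; School of International Business, Beijing Foreign Studies University, Beijing 100089
2. School of Sciences, Kunming University of Science and Technology, Kunming 650091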
Funding: Supported by the National Natural Science Foundation of China
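Abstract: This paper studies a class of non-homogeneous partially observable Markov decision models. Without changing the countability of the state space, the model is transformed into the generalized discounted model of [5], which resolves its optimal policy problem. A finite-horizon approximation algorithm for the model is also obtained, and the states involved in this algorithm are countable.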

Keywords: partially observable Markov decision processes; optimal policy; non-homogeneous; discounted model; approximation
Revision received: 2000-07-10

Partially Observable Non-homogeneous Markov Decision Processes
Zhang Jihong, Guo Shizheng, Zhang Yun. Partially Observable Non-homogeneous Markov Decision Processes[J]. OR Transactions, 2004, 8(2): 81-87
Authors: Zhang Jihong, Guo Shizheng, Zhang Yun
Affiliation: Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100080, China; School of International Business, Beijing Foreign Studies University, Beijing 100089, China; School of Sciences, Kunming University of Science and Technology, Kunming 650091, China
Abstract: In this paper, we discuss the optimal policy problem for partially observable non-homogeneous Markov decision processes (NPOMDPs). Using a new method, we transform the NPOMDP model into the generalized discounted Markov decision process with countable state space of [5]. This solves the optimal policy problem for the model and yields a finite-horizon approximation algorithm in which the state space involved is countable.
Keywords: OR; partially observable Markov decision processes; non-homogeneous; optimal policy; discounted model; approximation algorithm
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.