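1.
Discounted semi-Markov decision processes with nonnegative costs
黄永辉, 郭先平. 《数学学报》 (Acta Mathematica Sinica), 2010, 53(3): 503-514
This paper considers discounted semi-Markov decision processes with countable states and nonnegative costs. First, a continuous-time semi-Markov decision process is constructed under a given semi-Markov decision kernel and a policy. Then the minimum nonnegative solution approach is used to prove that the value function satisfies the optimality equation and that ε-optimal stationary policies exist. Conditions for the existence of optimal policies, together with some of their properties, are also given. Finally, a value iteration algorithm and a numerical example are presented.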
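The value iteration mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a finite state space, and it assumes the expected discounted one-stage costs `cost[i][a]` and the discounted transition weights `beta[i][a][j]` (the discounted form of the semi-Markov kernel) are already available as plain lists. All names and the two-state data are hypothetical. The iteration starts from zero, so with nonnegative costs the iterates increase monotonically, in the spirit of a minimum nonnegative solution.

```python
def value_iteration(cost, beta, tol=1e-10, max_iter=10_000):
    """Iterate v <- min_a { cost[i][a] + sum_j beta[i][a][j] * v[j] } from v = 0.

    cost[i][a]    : assumed expected discounted cost of action a in state i (>= 0)
    beta[i][a][j] : assumed discounted weight of moving from i to j under a
    """
    n = len(cost)
    v = [0.0] * n  # start from zero so iterates are nondecreasing
    for _ in range(max_iter):
        new_v = [
            min(
                cost[i][a] + sum(beta[i][a][j] * v[j] for j in range(n))
                for a in range(len(cost[i]))
            )
            for i in range(n)
        ]
        if max(abs(x - y) for x, y in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


def greedy_policy(cost, beta, v):
    """Pick, in each state, an action attaining the minimum for the given v."""
    n = len(cost)
    return [
        min(
            range(len(cost[i])),
            key=lambda a: cost[i][a] + sum(beta[i][a][j] * v[j] for j in range(n)),
        )
        for i in range(n)
    ]


# Hypothetical two-state example: state 1 is cost-free and absorbing.
example_cost = [[1.0, 1.5], [0.0]]
example_beta = [
    [[0.5, 0.0], [0.0, 0.5]],  # state 0: action 0 stays, action 1 moves to state 1
    [[0.0, 0.5]],              # state 1: single action, stays in state 1
]
example_value = value_iteration(example_cost, example_beta)
example_policy = greedy_policy(example_cost, example_beta, example_value)
print(example_value, example_policy)
```

In this toy instance, staying in state 0 forever would cost 2 in total, so paying 1.5 once to reach the cost-free state is the better choice there.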
2.
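This paper considers the average criterion for nonstationary Markov decision processes (MDPs) with a countable state space. First, we point out and correct errors in Park et al. [1] and Alden et al. [2]. Under conditions weaker than those of Park et al. [1], and using a newly established optimality equation, we prove the convergence of the optimal average value and the existence of average-optimal Markov policies. Second, we give a rolling horizon algorithm for ε(>0)-average-optimal Markov policies.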
3.
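This paper considers the expected average criterion for nonstationary MDPs. Under a weak ergodicity condition, the existence of ε(>0)-optimal Markov policies is proved using probabilistic and martingale methods. As a special case, this largely resolves the open problem raised by Feinberg and Park in 1994.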
4.
This paper concerns the construction and regularity of a transition (probability) function of a non-homogeneous continuous-time Markov process with given transition rates and a general state space. Motivated by the restrictions that arise in applications when the conservative transition rates q(t, x, Λ) are required to be continuous in t≥0, we consider the case in which q(t, x, Λ) need only satisfy a mild measurability condition in t≥0, which generalizes the continuity condition. Under this measurability condition we construct a transition function with the given transition rates, give a necessary and sufficient condition for it to be regular, and obtain some additional results.
5.
This paper considers a first passage model for discounted semi-Markov decision processes with denumerable states and nonnegative costs. The criterion to be optimized is the expected discounted cost incurred during a first passage time to a given target set. We first construct a semi-Markov decision process under a given semi-Markov decision kernel and a policy. Then, using a minimum nonnegative solution approach, we prove that the value function satisfies the optimality equation and that an optimal (or ε-optimal) stationary policy exists under suitable conditions. We further give some properties of optimal policies. In addition, a value iteration algorithm for computing the value function and optimal policies is developed, and an example is given. Finally, it is shown that our model extends the first passage models for both discrete-time and continuous-time Markov decision processes.
6.
In this paper we study zero-sum stochastic games. The optimality criterion is the long-run expected average criterion, and the payoff function may have neither upper nor lower bounds. We give a new set of conditions for the existence of a value and a pair of optimal stationary strategies. Our conditions are slightly weaker than those in the previous literature, and some new sufficient conditions for the existence of a pair of optimal stationary strategies are imposed on the primitive data of the model. Our results are illustrated with a queueing system, for which our conditions are satisfied but some of the conditions in the previous literature fail to hold.
7.
This work develops near-optimal controls for systems given by differential equations with wideband noise and random switching. The random switching is modeled by a continuous-time, time-inhomogeneous Markov chain. Under broad conditions, it is shown that there is an associated limit problem, which is a switching jump diffusion. Using near-optimal controls of the limit system, we then build controls for the original systems. It is shown that the controls so constructed are nearly optimal.