Similar Documents
20 similar documents found.
1.
This paper deals with Blackwell optimality for continuous-time controlled Markov chains with compact Borel action space, and possibly unbounded reward (or cost) rates and unbounded transition rates. We prove the existence of a deterministic stationary policy which is Blackwell optimal in the class of all admissible (nonstationary) Markov policies, thus extending previous results that analyzed Blackwell optimality in the class of stationary policies. We compare our assumptions to the corresponding ones for discrete-time Markov controlled processes.
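For context, Blackwell optimality is commonly defined as follows (a standard formulation, not quoted from the paper). Writing $V_\alpha(x,\pi) = \mathbb{E}^\pi_x \int_0^\infty e^{-\alpha t} r(X_t, A_t)\,dt$ for the expected $\alpha$-discounted reward, a policy $\pi^*$ is Blackwell optimal if there exists $\alpha_0 > 0$ such that

$$V_\alpha(x, \pi^*) \ge V_\alpha(x, \pi) \quad \text{for all states } x, \text{ all admissible } \pi, \text{ and all } \alpha \in (0, \alpha_0).$$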

2.
Ergodic control of singularly perturbed Markov chains with general state and compact action spaces is considered. A new method is given for characterization of the limit of invariant measures, for perturbed chains, when the perturbation parameter goes to zero. It is also demonstrated that the limit control principle is satisfied under natural ergodicity assumptions about controlled Markov chains. These assumptions allow for the presence of transient states, a situation that has not been considered in the literature before in the context of control of singularly perturbed Markov processes with long-run-average cost functionals. Accepted 3 December 1996

3.
By using a split argument due to [1], the transportation cost inequality is established on the free path space of Markov processes. The general result is applied to stochastic reaction-diffusion equations with random initial values.
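For reference, a transportation cost inequality for a reference measure $\mu$ on a metric space typically takes the form (generic statement; the paper's path-space version is more specific):

$$W_2(\nu, \mu)^2 \le 2C\, H(\nu \mid \mu) \qquad \text{for all probability measures } \nu,$$

where $W_2$ is the quadratic Wasserstein distance and $H(\nu \mid \mu) = \int \log\frac{d\nu}{d\mu}\, d\nu$ is the relative entropy.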

4.
We study optimal control of Markov processes with age-dependent transition rates. The control policy is chosen continuously over time based on the state of the process and its age. We study infinite horizon discounted cost and infinite horizon average cost problems. Our approach is via the construction of an equivalent semi-Markov decision process. We characterise the value function and optimal controls for both discounted and average cost cases.
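For reference, with running cost $c$ and discount rate $\alpha > 0$, the two criteria mentioned here are standardly defined by (generic definitions, not quoted from the paper)

$$V_\alpha(x,\pi) = \mathbb{E}^\pi_x \int_0^\infty e^{-\alpha t} c(X_t, A_t)\,dt, \qquad J(x,\pi) = \limsup_{T\to\infty} \frac{1}{T}\,\mathbb{E}^\pi_x \int_0^T c(X_t, A_t)\,dt.$$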

5.
Functional Inequalities and Their Applications
王凤雨 (Feng-Yu Wang), 《数学进展》 (Advances in Mathematics), 2003, 32(5): 513-528
This paper surveys some recent progress concerning functional inequalities and their role in spectral theory and the study of Markov processes. We first briefly review two celebrated inequalities, the Poincaré inequality and the logarithmic Sobolev inequality, and then apply functional inequalities to the study of the essential spectrum, the convergence rate of Markov semigroups, and transportation cost inequalities.
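For readers' convenience, for a Dirichlet form $(\mathcal{E}, \mathcal{D}(\mathcal{E}))$ with symmetric probability measure $\mu$, the two inequalities recalled in this survey read, in their standard form:

$$\mu(f^2) - \mu(f)^2 \le C\,\mathcal{E}(f,f) \quad \text{(Poincaré)}, \qquad \mu(f^2 \log f^2) \le C\,\mathcal{E}(f,f) \ \text{ for } \mu(f^2) = 1 \quad \text{(log-Sobolev)}.$$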

6.
This paper addresses the long-term average cost control of continuous-time Markov processes. A survey of problems and methods contained in various works is given for continuous control, optimal stopping, and impulse control.

7.
This paper studies nonstationary Markov decision processes with the average cost criterion on a general state space. The result that, in the stationary case, the average cost optimality equation can be established via the optimality equation of an auxiliary discounted model is extended to the nonstationary case. Using this result, the existence of optimal policies is proved.

8.
We introduce a class of models for multidimensional control problems that we call skip-free Markov decision processes on trees. We describe and analyse an algorithm applicable to Markov decision processes of this type that are skip-free in the negative direction. Starting with the finite average cost case, we show that the algorithm combines the advantages of both value iteration and policy iteration: it is guaranteed to converge to an optimal policy and optimal value function after a finite number of iterations, but the computational effort required for each iteration step is comparable with that for value iteration. We show that the algorithm can also be used to solve discounted cost models and continuous-time models, and that a suitably modified algorithm can be used to solve communicating models.
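The hybrid algorithm itself is not reproduced in this abstract. For comparison, a minimal sketch of plain discounted value iteration for a finite MDP, the baseline whose per-iteration cost the authors say they match, might look like this (illustrative Python; the array layout and tolerances are our own assumptions, not the paper's):

    import numpy as np

    def value_iteration(P, c, alpha=0.9, tol=1e-8, max_iter=10_000):
        # P: (A, S, S) array, P[a, s, t] = transition probability s -> t under action a.
        # c: (A, S) array, one-step cost of action a in state s.
        # alpha: discount factor in (0, 1).
        A, S, _ = P.shape
        v = np.zeros(S)
        for _ in range(max_iter):
            # Bellman operator: one-step cost plus discounted expected
            # next-state value, minimized over actions.
            q = c + alpha * (P @ v)        # shape (A, S)
            v_new = q.min(axis=0)
            if np.max(np.abs(v_new - v)) < tol:
                v = v_new
                break
            v = v_new
        return v, q.argmin(axis=0)         # value function and a greedy policy

Policy iteration would instead solve a linear system for each candidate policy; per the abstract, the skip-free tree structure is what lets a single algorithm achieve finite convergence at value-iteration cost per step.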

9.
Impulsive control of continuous-time Markov processes with risk-sensitive long-run average cost is considered. The most general impulsive control problem is studied under the restriction that impulses occur only at dyadic moments. In the particular case of additive cost for impulses, the impulsive control problem is solved without restrictions on the moments of the impulses. Accepted 30 April 2001. Online publication 29 August 2001.
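For reference, with risk-sensitivity parameter $\gamma > 0$ and running cost $c$, the risk-sensitive long-run average cost is commonly defined as (a standard definition, not quoted from the paper; the paper's impulse costs enter inside the exponential)

$$J_\gamma(x, \pi) = \limsup_{T\to\infty} \frac{1}{\gamma T} \log \mathbb{E}^\pi_x \exp\Big( \gamma \int_0^T c(X_t)\,dt \Big).$$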

10.
11.
We study strong average optimality for Markov decision processes (MDPs) with countable state space, arbitrary action space, and non-uniformly bounded costs. Conditions are given under which every commonly used average optimal policy is also strongly average optimal, substantially extending the main results of Cavazos-Cadena and Fernandez-Gaucherand (Math. Meth. Oper. Res., 1996, 43: 281-300).

12.
Average cost Markov decision processes (MDPs) with compact state and action spaces and bounded lower semicontinuous cost functions are considered. Kurano [7] treated the general case in which, under Doeblin's hypothesis, several ergodic classes and a transient set are permitted for the Markov process induced by any randomized stationary policy, and showed the existence of a minimum pair of state and policy. This paper considers the same case as Kurano [7] and proves some new results giving the existence of an optimal stationary policy under some reasonable conditions.

13.
This paper concerns nonstationary continuous-time Markov control processes on Polish spaces, with the infinite-horizon discounted cost criterion. Necessary and sufficient conditions are given for a control policy to be optimal and asymptotically optimal. In addition, under suitable hypotheses, it is shown that the successive approximation procedure converges in the sense that the sequence of finite-horizon optimal cost functions and the corresponding optimal control policies both converge.

14.
We consider Markov control processes with Borel state space and Feller transition probabilities, satisfying some generalized geometric ergodicity conditions. We provide a new theorem on the existence of a solution to the average cost optimality equation.
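For reference, the average cost optimality equation referred to here takes the standard form (generic statement)

$$\rho + h(x) = \min_{a \in A(x)} \Big\{ c(x,a) + \int_X h(y)\, Q(dy \mid x, a) \Big\}, \qquad x \in X,$$

where $\rho$ is the optimal average cost, $h$ is a relative value (bias) function, and $Q$ is the transition kernel.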

15.
In this paper we are concerned with the existence of optimal stationary policies for infinite-horizon risk-sensitive Markov control processes with denumerable state space, unbounded cost function, and long-run average cost. Introducing a discounted cost dynamic game, we prove that its value function satisfies an Isaacs equation, and its relationship with the risk-sensitive control problem is studied. Using the vanishing discount approach, we prove that the risk-sensitive dynamic programming inequality holds, and derive an optimal stationary policy. Accepted 1 October 1997

16.
17.
18.
In this paper, we consider nonstationary Markov decision processes (MDPs, for short) with an average variance criterion on a countable state space, finite action spaces, and bounded one-step rewards. From the optimality equations provided in this paper, we translate the average variance criterion into a new average expected cost criterion. We then prove that there exists a Markov policy, optimal under the original average expected reward criterion, that minimizes the average variance within the class of optimal policies for that criterion.
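One common formalization of the average variance criterion (our paraphrase; the paper's exact definition may differ in detail) is

$$V(x, \pi) = \limsup_{n\to\infty} \frac{1}{n}\, \mathbb{E}^\pi_x \sum_{t=0}^{n-1} \big( r(X_t, A_t) - \rho^* \big)^2,$$

where $\rho^*$ is the optimal average expected reward; the policy sought minimizes $V$ over the set of average-reward optimal policies.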

19.
This paper addresses constrained Markov decision processes, with expected discounted total cost criteria, which are controlled by non-randomized policies. A dynamic programming approach is used to construct optimal policies. The convergence of the series of finite horizon value functions to the infinite horizon value function is also shown. A simple example illustrating an application is presented.
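Ignoring the constraints for illustration, the finite-horizon value functions whose convergence is asserted are generated by the standard backward recursion (generic statement, not quoted from the paper)

$$v_0 \equiv 0, \qquad v_{n+1}(x) = \min_{a \in A(x)} \Big\{ c(x,a) + \alpha \int v_n(y)\, Q(dy \mid x, a) \Big\},$$

with $v_n \to v_\infty$, the infinite-horizon discounted value function, as $n \to \infty$.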

20.
We find inequalities to estimate the stability (robustness) of a discounted cost optimization problem for discrete-time Markov control processes on a Borel state space. The one-stage cost is allowed to be unbounded. Unlike known results in this area, we consider a perturbation of the transition probabilities measured by the Kantorovich metric, which is closely related to weak convergence. The results obtained make it possible to estimate the rate at which the stability index vanishes when the approximation is made through empirical measures.
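In one dimension, the Kantorovich (Wasserstein-1) metric between equal-size empirical measures reduces to the mean absolute difference of sorted samples, which makes the empirical-approximation setting of the paper easy to experiment with. A minimal sketch (illustrative Python, not from the paper):

    import numpy as np

    def kantorovich_1d(xs, ys):
        # Kantorovich (Wasserstein-1) distance between two empirical measures
        # on the real line with equal sample sizes: the optimal coupling pairs
        # order statistics, so the distance is the mean of |x_(i) - y_(i)|.
        xs, ys = np.sort(xs), np.sort(ys)
        if len(xs) != len(ys):
            raise ValueError("equal sample sizes assumed in this sketch")
        return float(np.mean(np.abs(xs - ys)))

    rng = np.random.default_rng(0)
    print(kantorovich_1d(rng.normal(size=1000), rng.normal(0.5, size=1000)))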
