Retrieved 20 similar documents; search took 31 ms.
1.
Tomás Prieto-Rumeau, Acta Appl Math, 2006, 92(1): 77-96
This paper deals with Blackwell optimality for continuous-time controlled Markov chains with compact Borel action space, and possibly unbounded reward (or cost) rates and unbounded transition rates. We prove the existence of a deterministic stationary policy which is Blackwell optimal in the class of all admissible (nonstationary) Markov policies, thus extending previous results that analyzed Blackwell optimality in the class of stationary policies. We compare our assumptions to the corresponding ones for discrete-time Markov controlled processes.
2.
Ergodic control of singularly perturbed Markov chains with general state and compact action spaces is considered. A new method is given for characterizing the limit of invariant measures of the perturbed chains as the perturbation parameter goes to zero. It is also demonstrated that the limit control principle is satisfied under natural ergodicity assumptions on the controlled Markov chains. These assumptions allow for the presence of transient states, a situation that had not previously been considered in the literature on control of singularly perturbed Markov processes with long-run-average cost functionals.
Accepted 3 December 1996
3.
By using a split argument due to [1], the transportation cost inequality is established on the free path space of Markov processes. The general result is applied to stochastic reaction-diffusion equations with random initial values.
4.
We study optimal control of Markov processes with age-dependent transition rates. The control policy is chosen continuously over time based on the state of the process and its age. We study infinite horizon discounted cost and infinite horizon average cost problems. Our approach is via the construction of an equivalent semi-Markov decision process. We characterise the value function and optimal controls for both discounted and average cost cases.
5.
6.
Maurice Robin, Acta Appl Math, 1983, 1(3): 281-299
This paper addresses the long-term average cost control of continuous time Markov processes. A survey of problems and methods contained in various works is given for continuous control, optimal stopping, and impulse control.
7.
This paper studies nonstationary Markov decision processes with the average cost criterion on a general state space. The result for the stationary case, in which the average-cost optimality equation is established by means of the optimality equation of an auxiliary discounted model, is extended to the nonstationary case. Using this result, the existence of an optimal policy is proved.
8.
Edmund J Collins, The Journal of the Operational Research Society, 2015, 66(10): 1595-1604
We introduce a class of models for multidimensional control problems that we call skip-free Markov decision processes on trees. We describe and analyse an algorithm applicable to Markov decision processes of this type that are skip-free in the negative direction. Starting with the finite average cost case, we show that the algorithm combines the advantages of both value iteration and policy iteration: it is guaranteed to converge to an optimal policy and optimal value function after a finite number of iterations, but the computational effort required for each iteration step is comparable with that for value iteration. We show that the algorithm can also be used to solve discounted cost models and continuous-time models, and that a suitably modified algorithm can be used to solve communicating models.
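The abstract does not reproduce the paper's skip-free algorithm itself; as a point of comparison, here is a minimal relative value iteration sketch for a finite average-cost MDP (the two-dimensional cost array, list-of-matrices kernel, unichain assumption, and reference-state normalization are our choices, not the paper's):

```python
import numpy as np

def relative_value_iteration(P, c, tol=1e-9, max_iter=10_000):
    """Relative value iteration for a finite, unichain, average-cost MDP.

    P[a] is the transition matrix under action a (rows sum to 1);
    c[x, a] is the one-step cost. Returns the optimal average cost g,
    a relative value (bias) vector h, and a greedy stationary policy.
    """
    n_states, n_actions = c.shape
    h = np.zeros(n_states)
    for _ in range(max_iter):
        # Bellman backup: (Th)(x) = min_a [ c(x, a) + sum_y P(y | x, a) h(y) ]
        Q = c + np.stack([P[a] @ h for a in range(n_actions)], axis=1)
        Th = Q.min(axis=1)
        # Subtract the value at a reference state so iterates stay bounded;
        # the subtracted amount converges to the optimal average cost.
        g, h_new = Th[0] - h[0], Th - Th[0]
        if np.max(np.abs(h_new - h)) < tol:
            return g, h_new, Q.argmin(axis=1)
        h = h_new
    raise RuntimeError("relative value iteration did not converge")
```

Unlike plain value iteration for discounted cost, the iterates here are normalized at a reference state, since the unnormalized average-cost backups grow linearly in the number of iterations.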
9.
Impulsive control of continuous-time Markov processes with risk-sensitive long-run average cost is considered. The most general impulsive control problem is studied under the restriction that impulses occur at dyadic moments only. In the particular case of additive cost for impulses, the impulsive control problem is solved without restrictions on the moments of impulses.
Accepted 30 April 2001. Online publication 29 August 2001.
10.
11.
We study strong average optimality for Markov decision processes (MDPs) with countable state space, arbitrary action spaces, and non-uniformly bounded costs. Conditions are given under which every commonly used average optimal policy is also strongly average optimal, substantially extending the main results of Cavazos-Cadena and Fernandez-Gaucheran (Math. Meth. Oper. Res., 1996, 43: 281-300).
12.
Masami Kurano, Annals of Operations Research, 1991, 29(1): 375-385
Average cost Markov decision processes (MDPs) with compact state and action spaces and bounded lower semicontinuous cost functions are considered. Kurano [7] treated the general case, in which several ergodic classes and a transient set are permitted for the Markov process induced by any randomized stationary policy under Doeblin's hypothesis, and showed the existence of a minimum pair of state and policy. This paper considers the same case as Kurano [7] and proves new results giving an existence theorem for an optimal stationary policy under reasonable conditions.
13.
This paper concerns nonstationary continuous-time Markov control processes on Polish spaces, with the infinite-horizon discounted cost criterion. Necessary and sufficient conditions are given for a control policy to be optimal and asymptotically optimal. In addition, under suitable hypotheses, it is shown that the successive approximation procedure converges in the sense that the sequence of finite-horizon optimal cost functions and the corresponding optimal control policies both converge.
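The convergence of finite-horizon optimal cost functions mentioned in the abstract can be sketched in the simplest setting, a finite discrete-time discounted MDP (the paper's setting is nonstationary continuous time on Polish spaces; this toy, with our own P, c, and beta, only illustrates the geometric convergence of the successive approximations):

```python
import numpy as np

def finite_horizon_values(P, c, beta, horizon):
    """Optimal n-stage discounted costs v_0, v_1, ..., v_horizon.

    v_{n+1}(x) = min_a [ c(x, a) + beta * sum_y P(y | x, a) v_n(y) ].
    For beta in (0, 1) this backup is a beta-contraction, so v_n converges
    geometrically to the infinite-horizon value function v*.
    """
    n_states, n_actions = c.shape
    v = np.zeros(n_states)
    history = [v]
    for _ in range(horizon):
        v = np.min(c + beta * np.stack([P[a] @ v for a in range(n_actions)],
                                       axis=1), axis=1)
        history.append(v)
    return history
```

Each element of the returned list is an optimal finite-horizon cost function; successive differences shrink by at least the factor beta, which is the Banach fixed-point argument behind results of this kind.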
14.
Anna Jaśkiewicz, Andrzej S. Nowak, Journal of Mathematical Analysis and Applications, 2006, 316(2): 495-509
We consider Markov control processes with Borel state space and Feller transition probabilities, satisfying some generalized geometric ergodicity conditions. We provide a new theorem on the existence of a solution to the average cost optimality equation.
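In standard notation (our symbols, not necessarily the paper's: ρ* the optimal average cost, h the relative value function, c the one-stage cost, Q the transition kernel), the average cost optimality equation in question reads:

```latex
\rho^* + h(x) \;=\; \min_{a \in A(x)} \Big\{ c(x,a) + \int_X h(y)\, Q(dy \mid x, a) \Big\}, \qquad x \in X.
```

A measurable selector attaining the minimum then yields an average-cost optimal stationary policy.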
15.
In this paper we are concerned with the existence of optimal stationary policies for infinite-horizon risk-sensitive Markov control processes with denumerable state space, unbounded cost function, and long-run average cost. Introducing a discounted cost dynamic game, we prove that its value function satisfies an Isaacs equation, and we study its relationship with the risk-sensitive control problem. Using the vanishing discount approach, we prove that the risk-sensitive dynamic programming inequality holds, and derive an optimal stationary policy.
Accepted 1 October 1997
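For orientation, the risk-sensitive long-run average criterion is commonly written as follows (a sketch of the standard definition in our notation, with γ > 0 the risk-sensitivity parameter, not a formula taken from the paper):

```latex
J(x,\pi) \;=\; \limsup_{n \to \infty} \frac{1}{n\,\gamma} \log \mathbb{E}_x^{\pi}
\Big[ \exp\Big( \gamma \sum_{t=0}^{n-1} c(x_t, a_t) \Big) \Big].
```

The exponential inside the expectation penalizes cost variability, which is why the associated dynamic programming involves a game-theoretic (Isaacs) equation rather than a linear averaging operator.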
16.
17.
18.
Xianping Guo, Mathematical Methods of Operations Research, 1999, 49(1): 87-96
In this paper, we consider nonstationary Markov decision processes (MDPs, for short) with the average variance criterion on a countable state space, finite action spaces, and bounded one-step rewards. From the optimality equations provided in this paper, we translate the average variance criterion into a new average expected cost criterion. We then prove that there exists a Markov policy, optimal under the original average expected reward criterion, that minimizes the average variance within the class of optimal policies for that criterion.
19.
This paper addresses constrained Markov decision processes with expected discounted total cost criteria, controlled by non-randomized policies. A dynamic programming approach is used to construct optimal policies. The convergence of the sequence of finite-horizon value functions to the infinite-horizon value function is also shown. A simple example illustrating an application is presented.
20.
Evgueni Gordienko, Enrique Lemus-Rodríguez, Raúl Montes-de-Oca, Mathematical Methods of Operations Research, 2008, 68(1): 77-96
We find inequalities to estimate the stability (robustness) of a discounted cost optimization problem for discrete-time Markov control processes on a Borel state space. The one-stage cost is allowed to be unbounded. Unlike the known results in this area, we consider a perturbation of transition probabilities measured by the Kantorovich metric, which is closely related to weak convergence. The results obtained make it possible to estimate the rate at which the stability index vanishes when the approximation is made through empirical measures.
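In one dimension, the Kantorovich metric used in this abstract is what `scipy.stats.wasserstein_distance` computes; a toy sketch (the normal law, sample sizes, and seed are our choices, standing in for a transition kernel approximated by empirical measures):

```python
import numpy as np
from scipy.stats import wasserstein_distance  # 1-D Kantorovich (W1) metric

rng = np.random.default_rng(0)

# Stand-in for the unperturbed transition law at one fixed state-action
# pair: a standard normal. The paper treats general Borel-space kernels;
# this one-dimensional toy only illustrates the metric itself.
reference = rng.normal(0.0, 1.0, size=100_000)

def empirical_gap(n):
    """Kantorovich distance between an n-sample empirical measure and the
    reference sample -- the perturbation size the stability bounds use."""
    return wasserstein_distance(rng.normal(0.0, 1.0, size=n), reference)

# The perturbation shrinks as the empirical sample grows.
gaps = [empirical_gap(n) for n in (100, 1_000, 10_000)]
```

The stability inequalities of the paper then bound how far the optimal discounted cost of the perturbed model can drift in terms of exactly this kind of distance.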