Retrieved 20 similar documents; search took 31 ms.
1.
Tomás Prieto-Rumeau, Acta Appl Math, 2006, 92(1): 77-96
This paper deals with Blackwell optimality for continuous-time controlled Markov chains with compact Borel action space, and possibly unbounded reward (or cost) rates and unbounded transition rates. We prove the existence of a deterministic stationary policy which is Blackwell optimal in the class of all admissible (nonstationary) Markov policies, thus extending previous results that analyzed Blackwell optimality in the class of stationary policies. We compare our assumptions to the corresponding ones for discrete-time Markov controlled processes.
2.
Ergodic control of singularly perturbed Markov chains with general state and compact action spaces is considered. A new method is given for characterizing the limit of invariant measures of the perturbed chains as the perturbation parameter goes to zero. It is also demonstrated that the limit control principle is satisfied under natural ergodicity assumptions on the controlled Markov chains. These assumptions allow for the presence of transient states, a situation that had not previously been considered in the literature on control of singularly perturbed Markov processes with long-run-average cost functionals.
Accepted 3 December 1996
3.
By using a split argument due to [1], the transportation cost inequality is established on the free path space of Markov processes. The general result is applied to stochastic reaction-diffusion equations with random initial values.
4.
We study optimal control of Markov processes with age-dependent transition rates. The control policy is chosen continuously over time based on the state of the process and its age. We study infinite horizon discounted cost and infinite horizon average cost problems. Our approach is via the construction of an equivalent semi-Markov decision process. We characterise the value function and optimal controls for both discounted and average cost cases.
5.
6.
Maurice Robin, Acta Appl Math, 1983, 1(3): 281-299
This paper addresses the long-term average cost control of continuous time Markov processes. A survey of problems and methods contained in various works is given for continuous control, optimal stopping, and impulse control.
7.
This paper studies nonstationary Markov decision processes with the average cost criterion on a general state space. The result for the stationary case, in which the average-cost optimality equation is established by means of the optimality equation of an auxiliary discounted model, is extended to the nonstationary case. Using this result, the existence of an optimal policy is proved.
8.
Edmund J Collins, The Journal of the Operational Research Society, 2015, 66(10): 1595-1604
We introduce a class of models for multidimensional control problems that we call skip-free Markov decision processes on trees. We describe and analyse an algorithm applicable to Markov decision processes of this type that are skip-free in the negative direction. Starting with the finite average cost case, we show that the algorithm combines the advantages of both value iteration and policy iteration: it is guaranteed to converge to an optimal policy and optimal value function after a finite number of iterations, but the computational effort required for each iteration step is comparable with that for value iteration. We show that the algorithm can also be used to solve discounted cost models and continuous-time models, and that a suitably modified algorithm can be used to solve communicating models.
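The abstract does not reproduce the paper's skip-free algorithm itself; as a point of comparison, here is a minimal relative value iteration sketch for a finite average-cost MDP (the two-dimensional cost array, list-of-matrices kernel, unichain assumption, and reference-state normalization are our choices, not the paper's):

```python
import numpy as np

def relative_value_iteration(P, c, tol=1e-9, max_iter=10_000):
    """Relative value iteration for a finite, unichain, average-cost MDP.

    P[a] is the transition matrix under action a (rows sum to 1);
    c[x, a] is the one-step cost. Returns the optimal average cost g,
    a relative value (bias) vector h, and a greedy stationary policy.
    """
    n_states, n_actions = c.shape
    h = np.zeros(n_states)
    for _ in range(max_iter):
        # Bellman backup: (Th)(x) = min_a [ c(x, a) + sum_y P(y | x, a) h(y) ]
        Q = c + np.stack([P[a] @ h for a in range(n_actions)], axis=1)
        Th = Q.min(axis=1)
        # Subtract the value at a reference state so iterates stay bounded;
        # the subtracted amount converges to the optimal average cost.
        g, h_new = Th[0] - h[0], Th - Th[0]
        if np.max(np.abs(h_new - h)) < tol:
            return g, h_new, Q.argmin(axis=1)
        h = h_new
    raise RuntimeError("relative value iteration did not converge")
```

Unlike plain value iteration for discounted cost, the iterates here are normalized at a reference state, since the unnormalized average-cost backups grow linearly in the number of iterations.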
9.
Impulsive control of continuous-time Markov processes with risk-sensitive long-run average cost is considered. The most general impulsive control problem is studied under the restriction that impulses occur at dyadic moments only. In the particular case of additive cost for impulses, the impulsive control problem is solved without restrictions on the moments of impulses.
Accepted 30 April 2001. Online publication 29 August 2001.
10.
11.
We study strong average optimality for Markov decision processes (MDPs) with countable state space, arbitrary action spaces, and non-uniformly bounded costs. Conditions are given under which every commonly used average optimal policy is also strongly average optimal, substantially extending the main results of Cavazos-Cadena and Fernandez-Gaucheran (Math. Meth. Oper. Res., 1996, 43: 281-300).
12.
Masami Kurano, Annals of Operations Research, 1991, 29(1): 375-385
Average cost Markov decision processes (MDPs) with compact state and action spaces and bounded lower semicontinuous cost functions are considered. Kurano [7] treated the general case, in which several ergodic classes and a transient set are permitted for the Markov process induced by any randomized stationary policy under Doeblin's hypothesis, and showed the existence of a minimum pair of state and policy. This paper considers the same case as Kurano [7] and proves new results giving an existence theorem for an optimal stationary policy under reasonable conditions.
13.
This paper concerns nonstationary continuous-time Markov control processes on Polish spaces, with the infinite-horizon discounted cost criterion. Necessary and sufficient conditions are given for a control policy to be optimal and asymptotically optimal. In addition, under suitable hypotheses, it is shown that the successive approximation procedure converges in the sense that the sequence of finite-horizon optimal cost functions and the corresponding optimal control policies both converge.
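The convergence of finite-horizon optimal cost functions mentioned in the abstract can be sketched in the simplest setting, a finite discrete-time discounted MDP (the paper's setting is nonstationary continuous time on Polish spaces; this toy, with our own P, c, and beta, only illustrates the geometric convergence of the successive approximations):

```python
import numpy as np

def finite_horizon_values(P, c, beta, horizon):
    """Optimal n-stage discounted costs v_0, v_1, ..., v_horizon.

    v_{n+1}(x) = min_a [ c(x, a) + beta * sum_y P(y | x, a) v_n(y) ].
    For beta in (0, 1) this backup is a beta-contraction, so v_n converges
    geometrically to the infinite-horizon value function v*.
    """
    n_states, n_actions = c.shape
    v = np.zeros(n_states)
    history = [v]
    for _ in range(horizon):
        v = np.min(c + beta * np.stack([P[a] @ v for a in range(n_actions)],
                                       axis=1), axis=1)
        history.append(v)
    return history
```

Each element of the returned list is an optimal finite-horizon cost function; successive differences shrink by at least the factor beta, which is the Banach fixed-point argument behind results of this kind.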
14.
Anna Jaśkiewicz, Andrzej S. Nowak, Journal of Mathematical Analysis and Applications, 2006, 316(2): 495-509
We consider Markov control processes with Borel state space and Feller transition probabilities, satisfying some generalized geometric ergodicity conditions. We provide a new theorem on the existence of a solution to the average cost optimality equation.
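In standard notation (our symbols, not necessarily the paper's: ρ* the optimal average cost, h the relative value function, c the one-stage cost, Q the transition kernel), the average cost optimality equation in question reads:

```latex
\rho^* + h(x) \;=\; \min_{a \in A(x)} \Big\{ c(x,a) + \int_X h(y)\, Q(dy \mid x, a) \Big\}, \qquad x \in X.
```

A measurable selector attaining the minimum then yields an average-cost optimal stationary policy.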
15.
In this paper we are concerned with the existence of optimal stationary policies for infinite-horizon risk-sensitive Markov control processes with denumerable state space, unbounded cost function, and long-run average cost. Introducing a discounted cost dynamic game, we prove that its value function satisfies an Isaacs equation, and we study its relationship with the risk-sensitive control problem. Using the vanishing discount approach, we prove that the risk-sensitive dynamic programming inequality holds, and derive an optimal stationary policy.
Accepted 1 October 1997
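For orientation, the risk-sensitive long-run average criterion is commonly written as follows (a sketch of the standard definition in our notation, with γ > 0 the risk-sensitivity parameter, not a formula taken from the paper):

```latex
J(x,\pi) \;=\; \limsup_{n \to \infty} \frac{1}{n\,\gamma} \log \mathbb{E}_x^{\pi}
\Big[ \exp\Big( \gamma \sum_{t=0}^{n-1} c(x_t, a_t) \Big) \Big].
```

The exponential inside the expectation penalizes cost variability, which is why the associated dynamic programming involves a game-theoretic (Isaacs) equation rather than a linear averaging operator.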
16.
17.
18.
Xianping Guo, Mathematical Methods of Operations Research, 1999, 49(1): 87-96
In this paper, we consider nonstationary Markov decision processes (MDPs, for short) with the average variance criterion on a countable state space, finite action spaces, and bounded one-step rewards. From the optimality equations provided in this paper, we translate the average variance criterion into a new average expected cost criterion. We then prove that there exists a Markov policy, optimal under the original average expected reward criterion, that minimizes the average variance within the class of optimal policies for that criterion.
19.
This paper addresses constrained Markov decision processes with expected discounted total cost criteria, controlled by non-randomized policies. A dynamic programming approach is used to construct optimal policies. The convergence of the sequence of finite-horizon value functions to the infinite-horizon value function is also shown. A simple example illustrating an application is presented.
20.
Evgueni Gordienko, Enrique Lemus-Rodríguez, Raúl Montes-de-Oca, Mathematical Methods of Operations Research, 2008, 68(1): 77-96
We find inequalities to estimate the stability (robustness) of a discounted cost optimization problem for discrete-time Markov control processes on a Borel state space. The one-stage cost is allowed to be unbounded. Unlike the known results in this area, we consider a perturbation of transition probabilities measured by the Kantorovich metric, which is closely related to weak convergence. The results obtained make it possible to estimate the rate at which the stability index vanishes when the approximation is made through empirical measures.
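In one dimension, the Kantorovich metric used in this abstract is what `scipy.stats.wasserstein_distance` computes; a toy sketch (the normal law, sample sizes, and seed are our choices, standing in for a transition kernel approximated by empirical measures):

```python
import numpy as np
from scipy.stats import wasserstein_distance  # 1-D Kantorovich (W1) metric

rng = np.random.default_rng(0)

# Stand-in for the unperturbed transition law at one fixed state-action
# pair: a standard normal. The paper treats general Borel-space kernels;
# this one-dimensional toy only illustrates the metric itself.
reference = rng.normal(0.0, 1.0, size=100_000)

def empirical_gap(n):
    """Kantorovich distance between an n-sample empirical measure and the
    reference sample -- the perturbation size the stability bounds use."""
    return wasserstein_distance(rng.normal(0.0, 1.0, size=n), reference)

# The perturbation shrinks as the empirical sample grows.
gaps = [empirical_gap(n) for n in (100, 1_000, 10_000)]
```

The stability inequalities of the paper then bound how far the optimal discounted cost of the perturbed model can drift in terms of exactly this kind of distance.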