期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Variance-minimization of Markov control processes with pathwise constraints

《Optimization》2012,61(12):1427-1447

This article is concerned with the limiting average variance for discrete-time Markov control processes in Borel spaces, subject to pathwise constraints. Under suitable hypotheses we show that within the class of deterministic stationary optimal policies for the pathwise constrained problem, there exists one with a minimal variance. 相似文献

2.

Stopped Markov decision processes with multiple constraints

Masayuki Horiguchi 《Mathematical Methods of Operations Research》2001,54(3):455-469

相似文献

3.

Markov decision processes under observability constraints

Yasemin Serin Vidyadhar Kulkarni 《Mathematical Methods of Operations Research》2005,61(2):311-328

相似文献

4.

Markov control processes with randomized discounted cost

Juan González-Hernández Raquiel R. López-Martínez J. Rubén Pérez-Hernández 《Mathematical Methods of Operations Research》2007,65(1):27-44

In this paper we consider Markov Decision Processes with discounted cost and a random rate in Borel spaces. We establish the dynamic programming algorithm in finite and infinity horizon cases. We provide conditions for the existence of measurable selectors. And we show an example of consumption-investment problem. This research was partially supported by the PROMEP grant 103.5/05/40. 相似文献

5.

First passage Markov decision processes with constraints and varying discount factors

Xiao WU Xiaolong ZOU Xianping GUO 《Frontiers of Mathematics in China》2015,10(4):1005

This paper focuses on the constrained optimality problem (COP) of first passage discrete-time Markov decision processes (DTMDPs) in denumerable state and compact Borel action spaces with multi-constraints, state-dependent discount factors, and possibly unbounded costs. By means of the properties of a so-called occupation measure of a policy, we show that the constrained optimality problem is equivalent to an (infinite-dimensional) linear programming on the set of occupation measures with some constraints, and thus prove the existence of an optimal policy under suitable conditions. Furthermore, using the equivalence between the constrained optimality problem and the linear programming, we obtain an exact form of an optimal policy for the case of finite states and actions. Finally, as an example, a controlled queueing system is given to illustrate our results. 相似文献

6.

Convergence of Markov decision processes with constraints and state-action dependent discount factors

Wu Xiao Guo Xianping 《中国科学数学(英文版)》2020,63(1):167-182

This paper is concerned with the convergence of a sequence of discrete-time Markov decision processes(DTMDPs)with constraints,state-action dependent discount factors,and possibly unbounded costs.Using the convex analytic approach under mild conditions,we prove that the optimal values and optimal policies of the original DTMDPs converge to those of the"limit"one.Furthermore,we show that any countablestate DTMDP can be approximated by a sequence of finite-state DTMDPs,which are constructed using the truncation technique.Finally,we illustrate the approximation by solving a controlled queueing system numerically,and give the corresponding error bound of the approximation. 相似文献

7.

Optimal control of stationary Markov processes

R. Morton 《Stochastic Processes and their Applications》1973,1(3):237-249

Sufficient conditions are given for the optimal control of Markov processes when the control policy is stationary and the process possesses a stationary distribution. The costs are unbounded and additive, and may or may not be discounted. Applications to Semi-Markov processes are included, and the results for random walks are related to the author's previous papers on diffusion processes. 相似文献

8.

MARKOV DECISION PROGRAMMING WITH CONSTRAINTS 总被引：1，自引：0，他引：1

刘建庸刘克《应用数学学报(英文版)》1994,10(1):1-11

ＭＡＲＫＯＶＤＥＣＩＳＩＯＮＰＲＯＧＲＡＭＭＩＮＧＷＩＴＨＣＯＮＳＴＲＡＩＮＴＳＬＩＵＪＩＡＮＹＯＮＧ（刘建庸）；ＬＩＵＫＥ（刘克）（ＩｎｓｔｉｔｕｔｅｏｆＡｐｐｌｉｅｄＭａｔｈｅｍａｔｉｃｓ，ｔｈｅＣｈｉｎｅｓｅＡｃａｄｅｍｙｏｆＳｃｉｅｎｃｅｓ，... 相似文献

9.

Markov programming with policy constraints

N.A.J. Hastings D. Sadjadi 《European Journal of Operational Research》1979,3(3):253-255

In the theory and applications of Markov decision processes introduced by Howard and subsequently developed by many authors, it is assumed that actions can be chosen independently at each state. A policy constrained Markov decision process is one where selecting a given action in one state restricts the choice of actions in another. This note describes a method for determining a maximal gain policy in the policy constrained case. The method involves the use of bounds on the gain of the feasible policies to produce a policy ranking list. This list then forms a basis for a bounded enumeration procedure which yields the optimal policy. 相似文献

10.

Optimal ergodic control of Markov diffusion processes with minimum variance

《Stochastics An International Journal of Probability and Stochastic Processes》2013,85(6):929-945

In this paper, we study the optimal ergodic control problem with minimum variance for a general class of controlled Markov diffusion processes. To this end, we follow a lexicographical approach. Namely, we first identify the class of average optimal control policies, and then within this class, we search policies that minimize the limiting average variance. To do this, a key intermediate step is to show that the limiting average variance is a constant independent of the initial state. Our proof of this latter fact gives a result stronger than the central limit theorem for diffusions. An application to manufacturing systems illustrates our results. 相似文献

11.

Multicriteria impulsive control of jump Markov processes

A. B.?Piunovskiy Email author 《Mathematical Methods of Operations Research》2004,60(1):125-144

相似文献

12.

Optimal stopping with continuous control of piecewise deterministic Markov processes

《Stochastics An International Journal of Probability and Stochastic Processes》2013,85(1-2):41-73

In this paper we consider the problem of optimal stopping and continuous control on some local parameters of a piecewise-deterministic Markov processes (PDP's). Optimality equations are obtained in terms of a set of variational inequalities as well as on the first jump time operator of the PDP. It is shown that if the final cost function is absolutely continuous along trajectories then so is the value function of the optimal stopping problem with continuous control. These results unify and generalize previous ones in the current literature. 相似文献

13.

Convergence of controlled models and finite-state approximation for discounted continuous-time Markov decision processes with constraints

Xianping Guo Wenzhao Zhang 《European Journal of Operational Research》2014

相似文献

14.

Optimal impulsive control of piecewise deterministic Markov processes

F. Dufour M. Horiguchi A. B. Piunovskiy 《Stochastics An International Journal of Probability and Stochastic Processes》2016,88(7):1073-1098

In this paper, we study the infinite-horizon expected discounted continuous-time optimal control problem for Piecewise Deterministic Markov Processes with both impulsive and gradual (also called continuous) controls. The set of admissible control strategies is supposed to be formed by policies possibly randomized and depending on the past-history of the process. We assume that the gradual control acts on the jump intensity and on the transition measure, but not on the flow. The so-called Hamilton–Jacobi–Bellman (HJB) equation associated to this optimization problem is analyzed. We provide sufficient conditions for the existence of a solution to the HJB equation and show that the solution is in fact unique and coincides with the value function of the control problem. Moreover, the existence of an optimal control strategy is proven having the property to be stationary and non-randomized. 相似文献

15.

Optimal control in light traffic Markov decision processes

Ger Koole Olaf Passchier 《Mathematical Methods of Operations Research》1997,45(1):63-79

We consider Markov Decision Processes under light traffic conditions. We develop an algorithm to obtain asymptotically optimal policies for both the total discounted and the average cost criterion. This gives a general framework for several light traffic results in the literature. We illustrate the method by deriving the asymptotically optimal control of a simple ATM network. 相似文献

16.

Markov processes with free-Meixner laws

Włodek Bryc 《Stochastic Processes and their Applications》2010

We study a time-non-homogeneous Markov process which arose from free probability, and which also appeared in the study of stochastic processes with linear regressions and quadratic conditional variances. Our main result is the explicit expression for the generator of the (non-homogeneous) transition operator acting on functions that extend analytically to complex domains. 相似文献

17.

A pathwise approach to the extinction of branching processes with countably many types

Peter Braunsteins Geoffrey Decrouez Sophie Hautphenne 《Stochastic Processes and their Applications》2019,129(3):713-739

We consider the extinction events of Galton–Watson processes with countably infinitely many types. In particular, we construct truncated and augmented Galton–Watson processes with finite but increasing sets of types. A pathwise approach is then used to show that, under some sufficient conditions, the corresponding sequence of extinction probability vectors converges to the global extinction probability vector of the Galton–Watson process with countably infinitely many types. Besides giving rise to a family of new iterative methods for computing the global extinction probability vector, our approach paves the way to new global extinction criteria for branching processes with countably infinitely many types. 相似文献

18.

Adaptive control of Markov processes with incomplete state information and unknown parameters

O. Hernandez-Lerma S. I. Marcus 《Journal of Optimization Theory and Applications》1987,52(2):227-241

Recent results for parameter-adaptive Markov decision processes (MDP's) are extended to partially observed MDP's depending on unknown parameters. These results include approximations converging uniformly to the optimal reward function and asymptotically optimal adaptive policies.This research was supported in part by the Consejo del Sistema Nacional de Educación Tecnologica (COSNET) under Grant 178/84, in part by the Air Force Office of Scientific Research under Grant AFOSR-84-0089, in part by the National Science Foundation under Grant ECS-84-12100, and in part by the Joint Services Electronics Program under Contract F49602-82-C-0033. 相似文献

19.

Recursive adaptive control of Markov decision processes with the average reward criterion

Rolando Cavazos-Cadena Onésimo Hernández-Lerma 《Applied Mathematics and Optimization》1991,23(1):193-207

We are concerned with Markov decision processes with Borel state and action spaces; the transition law and the reward function depend on anunknown parameter. In this framework, we study therecursive adaptive nonstationary value iteration policy, which is proved to be optimal under thesame conditions usually imposed to obtain the optimality of other well-knownnonrecursive adaptive policies. The results are illustrated by showing the existence of optimal adaptive policies for a class of additive-noise systems with unknown noise distribution.This research was supported in part by the Consejo Nacional de Ciencia y Tecnología under Grants PCEXCNA-050156 and A128CCOEO550, and in part by the Third World Academy of Sciences under Grant TWAS RG MP 898-152. 相似文献

20.

Adaptive control for discrete-time Markov processes with unbounded costs: Average criterion

Evgueni I. Gordienko J. Adolfo Minjárez-Sosa 《Mathematical Methods of Operations Research》1998,48(1):37-55

The paper deals with a class of discrete-time Markov control processes with Borel state and action spaces, and possibly unbounded one-stage costs. The processes are given by recurrent equations x _t ₊₁=F(x _t,a _t,ξ_t), t=1,2,… with i.i.d. ℜ^k– valued random vectors ξ_t whose density ρ is unknown. Assuming observability of ξ_t, and taking advantage of the procedure of statistical estimation of ρ used in a previous work by authors, we construct an average cost optimal adaptive policy. Received March/Revised version October 1997 相似文献