20 similar documents found (search time: 12 ms)
1.
K.D. Glazebrook 《Stochastic Processes and their Applications》1979,9(1):19-33
A large class of continuous parameter jump decision processes is considered. Pontryagin's Maximum Principle is used to derive a necessary condition for optimality. An optimal strategy may frequently be obtained explicitly.
2.
We give two simple axioms that characterize a simple functional form for aggregation of column stochastic matrices (i.e., Markov processes). Several additional observations are made about such aggregation, including the special case in which the aggregated process is Markovian relative to the original one.
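As a hedged illustration (not the paper's axioms), one simple aggregation of a column-stochastic matrix sums transition probabilities over each destination group and averages them over each source group; the 4-state matrix and the two-group partition below are invented for the example:

```python
# Sketch of aggregating a column-stochastic matrix over a state partition,
# with uniform weighting inside each source group.  P[i][j] is the
# probability of moving to state i given state j (columns sum to 1).
# The matrix and partition are illustrative, not from the paper.

def aggregate(P, partition):
    """Aggregate a column-stochastic matrix P over a list of state groups."""
    k = len(partition)
    Q = [[0.0] * k for _ in range(k)]
    for h, src in enumerate(partition):       # source group (column block)
        for g, dst in enumerate(partition):   # destination group (row block)
            # sum over destination states, average over source states
            Q[g][h] = sum(P[i][j] for i in dst for j in src) / len(src)
    return Q

P = [[0.5, 0.2, 0.1, 0.0],
     [0.3, 0.4, 0.1, 0.2],
     [0.1, 0.3, 0.6, 0.3],
     [0.1, 0.1, 0.2, 0.5]]
Q = aggregate(P, [[0, 1], [2, 3]])
# each column of the aggregated matrix still sums to 1,
# so Q is again a column-stochastic (Markov) matrix
for h in range(2):
    assert abs(sum(Q[g][h] for g in range(2)) - 1.0) < 1e-12
```

Whether such an aggregated process is itself Markovian relative to the original one is exactly the lumpability question the paper's special case addresses.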
3.
M. Sun 《Journal of Optimization Theory and Applications》1993,79(2):405-413
We introduce a revised simplex algorithm for solving a typical type of dynamic programming equation arising from a class of finite Markov decision processes. The algorithm also applies to several types of optimal control problems with diffusion models after discretization. It is based on the regular simplex algorithm, the duality concept in linear programming, and certain special features of the dynamic programming equation itself. Convergence is established for the new algorithm, which is potentially attractive when the number of actions is very large or even infinite.
4.
We consider a discrete-time constrained Markov decision process under the discounted cost optimality criterion. The state and action spaces are assumed to be Borel spaces, while the cost and constraint functions might be unbounded. We are interested in approximating numerically the optimal discounted constrained cost. To this end, we suppose that the transition kernel of the Markov decision process is absolutely continuous with respect to some probability measure μ. Then, by solving the linear programming formulation of a constrained control problem related to the empirical probability measure μn of μ, we obtain the corresponding approximation of the optimal constrained cost. We derive a concentration inequality which gives bounds on the probability that the estimation error is larger than some given constant. This bound is shown to decrease exponentially in n. Our theoretical results are illustrated with a numerical application based on a stochastic version of the Beverton–Holt population model.
5.
Chandan Pal 《Stochastics An International Journal of Probability and Stochastic Processes》2019,91(2):155-174
We study a stochastic control problem for pure jump processes on a general state space with risk-sensitive discounted and ergodic cost criteria. For the discounted cost criterion we prove the existence and Hamilton–Jacobi–Bellman characterization of an optimal α-discounted control for bounded cost functions. For the ergodic cost criterion we assume a Lyapunov-type stability assumption and a small-cost condition; under these assumptions we show the existence of an optimal risk-sensitive ergodic control.
6.
The Markov decision process is studied under maximization of the probability that the total discounted reward exceeds a target level. We study the dynamic programming equations of the model, give various properties of the optimal return operator, and, for the infinite planning-horizon model, characterize the optimal value function as a maximal fixed point of this operator. Various turnpike results relating the finite- and infinite-horizon models are also given.
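A minimal sketch of the target-level recursion behind such models, assuming a finite-horizon, undiscounted variant with integer rewards so that target levels stay discrete; the two-state, two-action chain is invented for illustration and is not from the paper:

```python
# v(n, x, t) = maximal probability that the total reward collected over the
# next n steps, starting in state x, is at least t.  Discounting is omitted
# and rewards are integers so target levels stay discrete.
from functools import lru_cache

P = {  # P[(x, a)] = list of (next_state, prob); an invented toy model
    (0, 0): [(0, 0.9), (1, 0.1)],
    (0, 1): [(0, 0.2), (1, 0.8)],
    (1, 0): [(0, 0.5), (1, 0.5)],
    (1, 1): [(1, 1.0)],
}
R = {(0, 0): 0, (0, 1): 1, (1, 0): 2, (1, 1): 1}  # integer rewards

@lru_cache(maxsize=None)
def v(n, x, t):
    """Max probability of accumulating total reward >= t in n steps from x."""
    if n == 0:
        return 1.0 if t <= 0 else 0.0
    # collect the reward for (x, a), shift the target, recurse on successors
    return max(
        sum(p * v(n - 1, y, t - R[(x, a)]) for y, p in P[(x, a)])
        for a in (0, 1) if (x, a) in P
    )
```

The operator mapping v(n-1, ·, ·) to v(n, ·, ·) plays the role of the optimal return operator whose maximal fixed point the paper studies in the infinite-horizon case.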
7.
We study controlled Markov processes where multiple decisions need to be made for each state. We present conditions on the cost structure and the state transition mechanism of the process under which optimal decisions are restricted to a subset of the decision space. As a result, the numerical computation of the optimal policy may be significantly expedited.
8.
N. N. Ganihodzhaev 《Journal of Theoretical Probability》1991,4(4):639-653
This paper is a continuation of our earlier paper (J. Theoret. Prob. 3, 51–70). We study the existence and uniqueness of solutions of the equations for quadratic stochastic processes.
9.
Jürgen Groh 《Stochastic Processes and their Applications》1980,10(3):271-297
An extension of the work of P. Mandl concerning the optimal control of time-homogeneous diffusion processes in one dimension is given. Instead of a classical second-order differential operator as infinitesimal generator, Feller's generalized differential operator DmD+p with a possibly nondecreasing weight function m is used. In this manner an optimal control of a wider class of one-dimensional Markov processes, including diffusions as well as birth and death processes, is realized.
10.
Edilson F. Arruda 《Operations Research Letters》2011,39(3):193-197
This note addresses the time aggregation approach to ergodic finite-state Markov decision processes with uncontrollable states. We propose using time aggregation as an intermediate step toward constructing a transformed MDP whose state space consists solely of the controllable states. The proposed approach simplifies the iterative search for the optimal solution by eliminating the need to define an equivalent parametric function, and results in a problem that can be solved by simpler, standard MDP algorithms.
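A common device behind such transformations, sketched here under the simplifying assumption of a single uncontrollable state (the paper's construction may differ), is to censor the chain so it is observed only on the controllable states; for a fixed policy this replaces P by P_CC + P_CU (I - P_UU)^(-1) P_UC, and with one uncontrollable state the inverse is a scalar:

```python
# Eliminate a single uncontrollable state u from a row-stochastic chain by
# censoring: watch the chain only when it visits the other states.  With one
# uncontrollable state, (I - P_uu)^(-1) is just 1 / (1 - P[u][u]).
# The 3-state chain (0, 1 controllable; 2 uncontrollable) is illustrative.

def censor_one_state(P, u):
    """Transition matrix of the chain observed only on states != u."""
    keep = [i for i in range(len(P)) if i != u]
    stay = 1.0 / (1.0 - P[u][u])   # expected sojourns in u per excursion
    return [[P[i][j] + P[i][u] * stay * P[u][j] for j in keep] for i in keep]

P = [[0.6, 0.2, 0.2],
     [0.1, 0.5, 0.4],
     [0.3, 0.3, 0.4]]
Q = censor_one_state(P, 2)
# the censored matrix is again row-stochastic on the controllable states
for row in Q:
    assert abs(sum(row) - 1.0) < 1e-12
```

The time-aggregation approach additionally accounts for the time and cost accrued while the chain sojourns in the uncontrollable states; this sketch only shows the state-space reduction.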
11.
The control of piecewise-deterministic processes is studied where only local boundedness of the data is assumed; moreover, the discount rate may be zero. The value function is shown to be a solution of the Bellman equation in a weak sense; however, the solution concept is strong enough to generate optimal policies. Continuity and compactness conditions are given for the existence of nonrelaxed optimal feedback controls.
12.
《Optimization》2012,61(5):651-670
Optimality problems in infinite-horizon, discrete-time, vector-criterion Markov and semi-Markov decision processes are expressed as standard problems of multiobjective linear programming. Processes with discounting, absorbing processes, and completely ergodic processes without discounting are investigated. The common properties and special structure of the derived multiobjective linear programming problems are reviewed, and the computational simplifications they admit in comparison with general multiobjective linear programming problems are discussed. Solution methods are reviewed and simple numerical examples are given.
13.
Quadratic assignment problems
Rainer E. Burkard 《European Journal of Operational Research》1984,15(3):283-289
This paper surveys quadratic assignment problems (QAPs). First, several applications of this problem class are described and mathematical formulations of QAPs are given. Then some exact solution methods and good heuristics are outlined, and their computational behaviour is illustrated by numerical results. Recent results on the asymptotic probabilistic behaviour of QAPs are also outlined.
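For tiny instances the QAP can be solved by exhaustive search over permutations, which makes the Koopmans–Beckmann formulation concrete; the flow and distance matrices below are invented for illustration:

```python
# Brute-force QAP: assign n facilities to n locations minimizing
# sum_{i,j} f[i][j] * d[p[i]][p[j]], where f is a flow matrix, d a distance
# matrix, and p[i] the location of facility i.  Exhaustive search is only
# feasible for tiny n; the 3x3 instance is illustrative.
from itertools import permutations

def qap_bruteforce(f, d):
    n = len(f)
    best_cost, best_perm = float("inf"), None
    for p in permutations(range(n)):
        cost = sum(f[i][j] * d[p[i]][p[j]] for i in range(n) for j in range(n))
        if cost < best_cost:
            best_cost, best_perm = cost, p
    return best_cost, best_perm

f = [[0, 3, 1],
     [3, 0, 2],
     [1, 2, 0]]
d = [[0, 1, 4],
     [1, 0, 2],
     [4, 2, 0]]
cost, perm = qap_bruteforce(f, d)
```

The factorial growth of the search space is precisely why the exact methods and heuristics surveyed in the paper are needed for realistic sizes.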
14.
This paper addresses constrained Markov decision processes with expected discounted total cost criteria that are controlled by non-randomized policies. A dynamic programming approach is used to construct optimal policies. The convergence of the sequence of finite-horizon value functions to the infinite-horizon value function is also shown. A simple example illustrating an application is presented.
15.
Value iteration and optimization of multiclass queueing networks
This paper considers in parallel the scheduling problem for multiclass queueing networks and the optimization of Markov decision processes. It is shown that the value iteration algorithm may perform poorly when it is not initialized properly. The most common choice, a zero initial value function, may be particularly bad. In contrast, if the value iteration algorithm is initialized with a stochastic Lyapunov function, then the following hold: (i) a stochastic Lyapunov function exists for each intermediate policy, and hence each policy is regular (a strong stability condition); (ii) intermediate costs converge to the optimal cost; and (iii) any limiting policy is average-cost optimal. It is argued that a natural choice for the initial value function is the value function of the associated deterministic control problem based on a fluid model, or the approximate solution to Poisson's equation obtained from the LP of Kumar and Meyn. Numerical studies show that either choice may lead to fast convergence to an optimal policy.
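The effect of initialization can be seen even in a toy discounted MDP (the paper's average-cost, Lyapunov-based setting is more delicate); the two-state model, costs, and discount factor below are all invented for illustration:

```python
# Value iteration converges to the same fixed point from any starting
# point, but a good initial guess can save iterations.  All numbers here
# are invented; this is a discounted toy analogue, not the paper's model.

GAMMA = 0.9
# cost[x][a]; trans[x][a] = list of (next_state, prob)
cost = [[1.0, 4.0], [3.0, 0.5]]
trans = [
    [[(0, 0.8), (1, 0.2)], [(1, 1.0)]],
    [[(0, 1.0)], [(0, 0.3), (1, 0.7)]],
]

def bellman(v):
    """One sweep of the Bellman (minimum-cost) operator."""
    return [min(cost[x][a] + GAMMA * sum(p * v[y] for y, p in trans[x][a])
                for a in range(2)) for x in range(2)]

def value_iteration(v0, tol=1e-10):
    v, n = v0, 0
    while True:
        w = bellman(v)
        n += 1
        if max(abs(w[x] - v[x]) for x in range(2)) < tol:
            return w, n
        v = w

v_cold, n_cold = value_iteration([0.0, 0.0])   # the "zero" initialization
v_warm, n_warm = value_iteration(v_cold)       # start near the fixed point
# both runs find the same fixed point; the warm start needs fewer sweeps
assert max(abs(a - b) for a, b in zip(v_cold, v_warm)) < 1e-8
assert n_warm <= n_cold
```

The paper's fluid-model value function plays the role of such a warm start, with the added point that in the average-cost setting a poor start can degrade the intermediate policies themselves, not just the iteration count.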
16.
A general problem in applying Markov decision processes to real-world problems is the curse of dimensionality: the size of the state space grows to prohibitive levels when information on all relevant traits of the system being modeled is included. In herd management, we face a hierarchy of decisions made at different levels with different time horizons, and these decisions are mutually dependent. Furthermore, decisions have to be made without certainty about the future state of the system. These aspects contribute even further to the dimensionality problem. A new notion of a multilevel hierarchic Markov process, specially designed to solve dynamic decision problems involving decisions with varying time horizons, is presented. The method contributes significantly to circumventing the curse of dimensionality, and it provides a framework for general herd management support instead of very specialized models concerned only with a single decision such as replacement. The application perspectives of the technique are illustrated by potential examples relating to the management of a sow herd and a dairy herd.
17.
W. J. Hopp 《Journal of Optimization Theory and Applications》1988,56(2):257-269
The problem of characterizing the minimum perturbations to parameters in future stages of a discrete dynamic program necessary to change the optimal first policy is considered. Lower bounds on these perturbations are derived and used to establish ranges for the reward functions over which the optimal first policy is robust. A numerical example is presented to illustrate factors affecting the tightness of these bounds.
18.
Juan González-Hernández Raquiel R. López-Martínez J. Rubén Pérez-Hernández 《Mathematical Methods of Operations Research》2007,65(1):27-44
In this paper we consider Markov decision processes with discounted cost and a random discount rate in Borel spaces. We establish the dynamic programming algorithm in the finite and infinite horizon cases, provide conditions for the existence of measurable selectors, and present an example of a consumption-investment problem.
This research was partially supported by the PROMEP grant 103.5/05/40.
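One convenient special case, far simpler than the paper's general Borel-space setting: when the per-stage discount factors are i.i.d. and independent of the rewards, the expected discounted total equals the deterministic total computed with E[α] in place of α. A Monte Carlo sanity check with invented numbers:

```python
# Check: E[sum_n (prod_{k<n} alpha_k) r_n] = sum_n E[alpha]^n r_n when the
# alpha_k are i.i.d. and independent of the (here fixed) reward stream.
# Rates uniform on [0.7, 0.9] and the reward stream are invented choices.
import random

random.seed(0)
rewards = [1.0, 2.0, 0.5, 3.0, 1.5]
E_ALPHA = 0.8                      # mean of Uniform(0.7, 0.9)

def discounted_sum(alphas):
    total, disc = 0.0, 1.0
    for r, a in zip(rewards, alphas):
        total += disc * r
        disc *= a                  # compound the random rate stage by stage
    return total

analytic = discounted_sum([E_ALPHA] * len(rewards))
mc = sum(discounted_sum([random.uniform(0.7, 0.9) for _ in rewards])
         for _ in range(100_000)) / 100_000
assert abs(mc - analytic) < 0.02   # Monte Carlo mean matches the analytic value
```

When the random rate is correlated with the state process, this reduction fails, which is why the paper needs the full dynamic programming treatment.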
20.
《Optimization》2012,61(9):1133-1150
This article presents a new linear programming (LP) method for solving Markov decision processes (MDPs) based on the simplex method (SM). SM has proved to be the most efficient method in many practical problems; unfortunately, the classical SM has exponential worst-case complexity. New simplex variants have therefore emerged to obtain optimal solutions more efficiently. The cosine simplex method (CSM) is one of them; it is based on the Karush-Kuhn-Tucker conditions and can efficiently solve general LP problems. This work presents a new method, the Markov Cosine Simplex Method (MCSM), an extension of CSM for solving MDP problems. The efficiency of MCSM is compared to that of the traditional revised simplex method (RSM); experimental results show that MCSM is far more efficient than RSM.