Similar Documents
20 similar documents found (search time: 12 ms).
1.
A large class of continuous parameter jump decision processes is considered. Pontryagin's Maximum Principle is used to derive a necessary condition for optimality. An optimal strategy may frequently be obtained explicitly.

2.
We give two simple axioms that characterize a simple functional form for aggregation of column stochastic matrices (i.e., Markov processes). Several additional observations are made about such aggregation, including the special case in which the aggregated process is Markovian relative to the original one.

3.
We introduce a revised simplex algorithm for solving a typical type of dynamic programming equation arising from a class of finite Markov decision processes. The algorithm also applies to several types of optimal control problems with diffusion models after discretization. It is based on the regular simplex algorithm, the duality concept in linear programming, and certain special features of the dynamic programming equation itself. Convergence is established for the new algorithm. The algorithm has favorable potential applicability when the number of actions is very large or even infinite.
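As a hedged illustration of the LP connection this abstract relies on: the sketch below solves a discounted MDP through the classical primal linear program (minimize the sum of v(s) subject to v >= r_a + gamma * P_a v for every action a) with a general-purpose solver. The 3-state, 2-action instance, the discount factor, and all numbers are made up for illustration; the paper's revised simplex algorithm exploits additional structure that this generic sketch does not.

```python
import numpy as np
from scipy.optimize import linprog

gamma = 0.9                                   # discount factor (assumed)
# Hypothetical 3-state, 2-action instance: P[a][s][s'] and r[s][a].
P = np.array([[[0.7, 0.2, 0.1],
               [0.1, 0.8, 0.1],
               [0.2, 0.3, 0.5]],
              [[0.0, 0.9, 0.1],
               [0.3, 0.3, 0.4],
               [0.5, 0.4, 0.1]]])
r = np.array([[1.0, 0.5],
              [0.0, 2.0],
              [0.8, 0.3]])

nA, nS, _ = P.shape
# Primal LP: min sum_s v(s)  s.t.  v(s) >= r(s,a) + gamma * sum_s' P(s'|s,a) v(s'),
# i.e. (I - gamma*P_a) v >= r_a for every a; linprog expects A_ub @ v <= b_ub.
A_ub = np.vstack([-(np.eye(nS) - gamma * P[a]) for a in range(nA)])
b_ub = np.concatenate([-r[:, a] for a in range(nA)])
res = linprog(c=np.ones(nS), A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * nS)

v_star = res.x                                # optimal value function
greedy = np.argmax(r + gamma * np.einsum('asx,x->sa', P, v_star), axis=1)
print(v_star, greedy)                         # values and a greedy (optimal) policy
```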

4.
We consider a discrete-time constrained Markov decision process under the discounted cost optimality criterion. The state and action spaces are assumed to be Borel spaces, while the cost and constraint functions might be unbounded. We are interested in approximating numerically the optimal discounted constrained cost. To this end, we suppose that the transition kernel of the Markov decision process is absolutely continuous with respect to some probability measure μ. Then, by solving the linear programming formulation of a constrained control problem related to the empirical probability measure μ_n of μ, we obtain the corresponding approximation of the optimal constrained cost. We derive a concentration inequality which gives bounds on the probability that the estimation error is larger than some given constant. This bound is shown to decrease exponentially in n. Our theoretical results are illustrated with a numerical application based on a stochastic version of the Beverton–Holt population model.
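For concreteness, here is a minimal sketch of a stochastic Beverton–Holt recursion of the kind the abstract's numerical application is based on. The growth parameters, the lognormal noise, and the harvesting action are assumptions for illustration, not the authors' exact model.

```python
import numpy as np

rng = np.random.default_rng(0)
r, K = 2.0, 100.0            # intrinsic growth rate and carrying capacity (assumed)

def step(x, a):
    """One transition: harvest a, then Beverton-Holt growth with lognormal noise."""
    z = rng.lognormal(mean=0.0, sigma=0.1)          # multiplicative environmental noise
    x = max(x - a, 0.0)                             # harvest (the control)
    return z * r * x / (1.0 + (r - 1.0) * x / K)    # Beverton-Holt map, fixed point K

x = 50.0
for t in range(5):
    x = step(x, a=5.0)
    print(f"t={t + 1}: population {x:.2f}")
```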

5.
We study a stochastic control problem for pure jump processes on a general state space with risk-sensitive discounted and ergodic cost criteria. For the discounted cost criterion we prove the existence and Hamilton–Jacobi–Bellman characterization of an optimal α-discounted control for bounded cost functions. For the ergodic cost criterion we assume a Lyapunov-type stability assumption and a small-cost condition. Under these assumptions we show the existence of an optimal risk-sensitive ergodic control.

6.
The Markov decision process is studied under maximization of the probability that total discounted rewards exceed a target level. We focus on the dynamic programming equations of the model. We give various properties of the optimal return operator and, for the infinite planning-horizon model, we characterize the optimal value function as a maximal fixed point of this operator. Various turnpike results relating the finite- and infinite-horizon models are also given.
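As a hedged illustration, the dynamic programming recursion for this class of target-level problems is commonly written over an augmented state (s, x), where x is the remaining target; the notation below is assumed, not taken from the paper:

```latex
% Optimal value: the best probability of accumulating at least x from state s.
\[
  V^*(s,x) \;=\; \sup_{\pi} \Pr\nolimits_{\pi}\!\Bigl(\sum_{t=0}^{\infty} \beta^{t} r_t \,\ge\, x \;\Big|\; s_0 = s\Bigr).
\]
% Optimal return operator: after earning r(s,a), the remaining target is
% rescaled by the discount factor \beta.
\[
  (TV)(s,x) \;=\; \max_{a \in A(s)} \sum_{s'} p(s' \mid s,a)\;
  V\!\Bigl(s',\, \tfrac{x - r(s,a)}{\beta}\Bigr).
\]
```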

7.
We study controlled Markov processes where multiple decisions need to be made for each state. We present conditions on the cost structure and the state transition mechanism of the process under which optimal decisions are restricted to a subset of the decision space. As a result, the numerical computation of the optimal policy may be significantly expedited.

8.
This paper is a continuation of our earlier paper (J. Theoret. Probab. 3, 51–70). The existence and uniqueness of solutions of equations for quadratic stochastic processes are studied in this paper.

9.
An extension of the work of P. Mandl concerning the optimal control of time-homogeneous diffusion processes in one dimension is given. Instead of a classical second-order differential operator as the infinitesimal generator, Feller's generalized differential operator D_m D_p^+ with a (possibly discontinuous) nondecreasing weight function m is used. In this manner an optimal control of a wider class of one-dimensional Markov processes, including diffusions as well as birth and death processes, is realized.

10.
This note addresses the time aggregation approach to ergodic finite-state Markov decision processes with uncontrollable states. We propose the use of time aggregation as an intermediate step toward constructing a transformed MDP whose state space consists solely of the controllable states. The proposed approach simplifies the iterative search for the optimal solution by eliminating the need to define an equivalent parametric function, and results in a problem that can be solved by simpler, standard MDP algorithms.

11.
The control of piecewise-deterministic processes is studied under the assumption of only local boundedness of the data. Moreover, the discount rate may be zero. The value function is shown to be a solution to the Bellman equation in a weak sense; however, the solution concept is strong enough to generate optimal policies. Continuity and compactness conditions are given for the existence of nonrelaxed optimal feedback controls.

12.
Optimization, 2012, 61(5): 651–670
Optimality problems in infinite-horizon, discrete-time, vector-criterion Markov and semi-Markov decision processes are expressed as standard problems of multiobjective linear programming. Processes with discounting, absorbing processes, and completely ergodic processes without discounting are investigated. The common properties and special structure of the derived multiobjective linear programming problems are reviewed. Computational simplifications associated with these problems, in comparison with general multiobjective linear programming problems, are discussed. Methods for solving these problems are surveyed and simple numerical examples are given.
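As a hedged sketch in standard occupation-measure notation (assumed, not the article's exact symbols), the discounted vector-criterion case becomes a multiobjective LP over variables x(s,a):

```latex
% Vector-maximize the K criteria over occupation measures x(s,a).
\[
  \text{v-max}\;\Bigl(\sum_{s,a} r_k(s,a)\,x(s,a)\Bigr)_{k=1}^{K}
\]
% subject to the discounted flow-balance constraints, with initial
% distribution \alpha and discount factor \beta \in (0,1):
\[
  \sum_{a} x(s',a) \;-\; \beta \sum_{s,a} p(s' \mid s,a)\,x(s,a) \;=\; \alpha(s')
  \quad \forall s', \qquad x \ge 0.
\]
```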

13.
Quadratic assignment problems
This paper surveys quadratic assignment problems (QAPs). First, several applications of this problem class are described and mathematical formulations of QAPs are given. Then some exact solution methods and good heuristics are outlined, and their computational behaviour is illustrated by numerical results. Finally, recent results on the asymptotic probabilistic behaviour of QAPs are outlined.

14.
This paper addresses constrained Markov decision processes, with expected discounted total cost criteria, which are controlled by non-randomized policies. A dynamic programming approach is used to construct optimal policies. The convergence of the sequence of finite-horizon value functions to the infinite-horizon value function is also shown. A simple example illustrating an application is presented.

15.
Value iteration and optimization of multiclass queueing networks
Chen, Rong-Rong; Meyn, Sean. Queueing Systems, 1999, 32(1–3): 65–97
This paper considers in parallel the scheduling problem for multiclass queueing networks and optimization of Markov decision processes. It is shown that the value iteration algorithm may perform poorly when it is not initialized properly. The most typical case, in which the initial value function is taken to be zero, may be a particularly bad choice. In contrast, if the value iteration algorithm is initialized with a stochastic Lyapunov function, then the following hold: (i) a stochastic Lyapunov function exists for each intermediate policy, and hence each policy is regular (a strong stability condition); (ii) intermediate costs converge to the optimal cost; and (iii) any limiting policy is average-cost optimal. It is argued that a natural choice for the initial value function is the value function of the associated deterministic control problem based upon a fluid model, or the approximate solution to Poisson's equation obtained from the LP of Kumar and Meyn. Numerical studies show that either choice may lead to fast convergence to an optimal policy.
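To make the initialization point concrete, here is a minimal sketch on a single controlled queue (a stand-in, not the paper's network): relative value iteration run once from the zero function and once from a quadratic, fluid-style guess. All parameters and the quadratic coefficient are assumptions.

```python
import numpy as np

N = 60                                   # buffer truncation (assumed)
lam = 0.2                                # uniformized arrival probability (assumed)
mus = np.array([0.3, 0.6])               # service probabilities of the two actions
c_extra = np.array([0.0, 0.5])           # surcharge for the fast action

def value_iteration(V, iters):
    """Average-cost relative value iteration; per-step cost = queue length + surcharge."""
    x = np.arange(N + 1)
    up, down = np.minimum(x + 1, N), np.maximum(x - 1, 0)
    for _ in range(iters):
        Q = np.stack([x + c + lam * V[up] + mu * V[down]
                      + (1.0 - lam - mu) * V
                      for mu, c in zip(mus, c_extra)], axis=1)
        V = Q.min(axis=1)
        V = V - V[0]                     # relative VI: subtract a reference value
    return V

V_zero  = value_iteration(np.zeros(N + 1), iters=1000)
V_fluid = value_iteration(np.arange(N + 1) ** 2 / (2 * (mus[1] - lam)), iters=1000)
# Both runs approach the same relative value function; the fluid-style start
# typically gets close in far fewer iterations.
print(np.max(np.abs(V_zero - V_fluid)))
```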

16.
A general problem in the application of Markov decision processes to real-world problems is the curse of dimensionality: the size of the state space grows to prohibitive levels when information on all relevant traits of the modeled system is included. In herd management we face a hierarchy of decisions made at different levels with different time horizons, and the decisions made at different levels are mutually dependent. Furthermore, decisions have to be made without certainty about the future state of the system. These aspects contribute even further to the dimensionality problem. A new notion of a multilevel hierarchic Markov process, specially designed to solve dynamic decision problems involving decisions with varying time horizons, is presented. The method contributes significantly to circumventing the curse of dimensionality, and it provides a framework for general herd management support instead of very specialized models concerned only with a single decision such as replacement. The practical perspectives of the technique are illustrated by potential examples relating to the management of a sow herd and a dairy herd.

17.
The problem of characterizing the minimum perturbations to parameters in future stages of a discrete dynamic program necessary to change the optimal first policy is considered. Lower bounds on these perturbations are derived and used to establish ranges for the reward functions over which the optimal first policy is robust. A numerical example is presented to illustrate factors affecting the tightness of these bounds.

18.
In this paper we consider Markov decision processes with discounted cost and a random discount rate in Borel spaces. We establish the dynamic programming algorithm in the finite- and infinite-horizon cases, and we provide conditions for the existence of measurable selectors. An example of a consumption-investment problem is presented. This research was partially supported by PROMEP grant 103.5/05/40.
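As a hedged illustration (notation assumed, not the paper's), a discounted Bellman equation in which the one-step discount factor is a random variable placed inside the expectation, together with the finite-horizon backward induction:

```latex
% Infinite-horizon equation: \tilde{\alpha} is the random one-step discount rate.
\[
  v(x) \;=\; \inf_{a \in A(x)} \,
  \mathbb{E}\bigl[\, c(x,a) + \tilde{\alpha}\, v(X_1) \,\bigm|\, X_0 = x,\ A_0 = a \,\bigr].
\]
% Finite-horizon algorithm: backward induction from v_0 \equiv 0.
\[
  v_n(x) \;=\; \inf_{a \in A(x)} \,
  \mathbb{E}\bigl[\, c(x,a) + \tilde{\alpha}\, v_{n-1}(X_1) \,\bigm|\, X_0 = x,\ A_0 = a \,\bigr].
\]
```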

19.
20.
Optimization, 2012, 61(9): 1133–1150
This article presents a new linear programming (LP) method for solving Markov decision processes (MDPs) based on the simplex method (SM). SM has been shown to be the most efficient method in many practical problems; unfortunately, the classical SM has exponential complexity. Therefore, new SMs have emerged for obtaining optimal solutions in the most efficient way. The cosine simplex method (CSM) is one of them: CSM is based on the Karush-Kuhn-Tucker conditions and is able to efficiently solve general LP problems. This work presents a new method, named the Markov Cosine Simplex Method (MCSM), for solving MDP problems, which is an extension of CSM. In this article the efficiency of MCSM is compared with that of the traditional revised simplex method (RSM); experimental results show that MCSM is far more efficient than RSM.

