Similar Literature
20 similar documents found (search time: 15 ms)
1.
We study risk-sensitive control of continuous-time Markov chains taking values in a discrete state space, for both finite and infinite horizon problems. In the finite horizon problem we characterize the value function via the Hamilton-Jacobi-Bellman equation and obtain an optimal Markov control; we do the same for the infinite horizon discounted cost case. In the infinite horizon average cost case we establish the existence of an optimal stationary control under a certain Lyapunov condition. We also develop a policy iteration algorithm for finding an optimal control.
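The policy iteration scheme mentioned above can be illustrated in its classical discrete-time, discounted, risk-neutral form (a sketch only, not the paper's continuous-time risk-sensitive variant); the transition matrices `P` and rewards `r` are hypothetical inputs:

```python
import numpy as np

def policy_iteration(P, r, gamma=0.9):
    """Classical policy iteration for a finite discounted MDP.

    P[a][s, :] : transition probabilities under action a (hypothetical data)
    r[a][s]    : reward for taking action a in state s
    """
    n_actions, n_states = len(P), P[0].shape[0]
    policy = np.zeros(n_states, dtype=int)
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) v = r_pi exactly.
        P_pi = np.array([P[policy[s]][s] for s in range(n_states)])
        r_pi = np.array([r[policy[s]][s] for s in range(n_states)])
        v = np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)
        # Policy improvement: greedy one-step lookahead on the Q-values.
        q = np.array([r[a] + gamma * P[a] @ v for a in range(n_actions)])
        new_policy = q.argmax(axis=0)
        if np.array_equal(new_policy, policy):
            return policy, v
        policy = new_policy
```

The loop terminates because each improvement step strictly increases the value of at least one state, and there are finitely many stationary policies.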

2.
We consider sequential decision problems over an infinite horizon. The forecast or solution horizon approach to solving such problems requires that the optimal initial decision be unique. We show that multiple optimal initial decisions can exist in general and refer to their existence as degeneracy. We then present a conceptual cost perturbation algorithm for resolving degeneracy and identifying a forecast horizon. We also present a general near-optimal forecast horizon. This material is based on work supported by the National Science Foundation under Grants ECS-8409682 and ECS-8700836.

3.
Workforce capacity planning in human resource management is a critical and essential component of services supply chain management. In this paper, we consider the planning problem of transferring, hiring, or firing employees among different departments or branches of an organization under an environment of uncertain workforce demands and turnover, with the objective of minimizing the expected cost over a finite planning horizon. We model the problem as a multistage stochastic program and propose a successive convex approximation method which solves the problem in stages and iteratively. An advantage of the method is that it can handle problems of large size, where solving the problems by equivalent deterministic linear programs is normally considered computationally infeasible. Numerical experiments indicate that solutions obtained by the proposed method have near-optimal expected costs.

4.
We discuss a linear-quadratic optimal control problem whose stochastic system is a linear stochastic differential equation driven by a Lévy process, with random coefficients and an affine term. The adjoint equation has unbounded coefficients, so its solvability is not obvious. Using the theory of BMO martingales, we prove existence and uniqueness of the solution of the adjoint equation on a finite horizon. Under a stability condition, existence of solutions of the infinite-horizon backward stochastic Riccati differential equation and the associated backward stochastic adjoint equation is obtained by approximation with the solutions of the corresponding finite-horizon equations. These solutions can be used to synthesize the optimal control.

5.
We study the Riccati equation arising in a class of quadratic optimal control problems with an infinite dimensional stochastic differential state equation and an infinite horizon cost functional. We allow the coefficients, both in the state equation and in the cost, to be random. In such a context, backward stochastic Riccati equations are backward stochastic differential equations on the whole positive real axis that involve quadratic non-linearities and take values in a non-Hilbertian space. We prove existence of a minimal non-negative solution and, under additional assumptions, its uniqueness. We show that such a solution allows us to perform the synthesis of the optimal control and investigate its attractivity properties. Finally, the case where the coefficients are stationary is addressed and an example concerning a controlled wave equation in random media is proposed.

6.
We address the optimal control problem of a very general stochastic hybrid system with both autonomous and impulsive jumps. The planning horizon is infinite and we use the discounted-cost criterion for performance evaluation. Under certain assumptions, we show the existence of an optimal control. We then derive the quasivariational inequalities satisfied by the value function and establish well-posedness. Finally, we prove the usual verification theorem of dynamic programming.

7.
In this paper we consider stopping problems for continuous-time Markov chains under a general risk-sensitive optimization criterion, for problems with finite and infinite time horizon. More precisely, our aim is to maximize the certainty equivalent of the stopping reward minus the cost over the time horizon. We derive optimality equations for the value functions and prove the existence of optimal stopping times. The exponential utility is treated as a special case. In contrast to risk-neutral stopping problems, it may be optimal to stop between jumps of the Markov chain. We briefly discuss the influence of the risk sensitivity on the optimal stopping time and consider a special house-selling problem as an example.

8.
Decision makers often face the need of a performance guarantee with some sufficiently high probability. Such problems can be modelled using a discrete-time Markov decision process (MDP) with a probability criterion for first achieving a target value. The objective is to find a policy that maximizes the probability of the total discounted reward exceeding a target value in the preceding stages. We show that our formulation cannot be described by former models with standard criteria. We provide the properties of the objective functions, optimal value functions and optimal policies. An algorithm for computing optimal policies in the finite horizon case is given. In this stochastic stopping model, we prove that there exists an optimal deterministic and stationary policy and that the optimality equation has a unique solution. Using perturbation analysis, we approximate general models and prove the existence of an ε-optimal policy for a finite state space. We give an example concerning the reliability of a satellite system.
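The finite-horizon computation described above can be sketched under simplifying assumptions: undiscounted integer rewards earned on transitions, with the state augmented by the remaining target, so that a backward recursion applies. The data layout `P[a][s][s2]`, `r[a][s][s2]` is a hypothetical choice for illustration, not the paper's formulation:

```python
from functools import lru_cache

def max_prob_exceed(P, r, target, horizon, start=0):
    """Maximize P(total reward over `horizon` steps >= target), starting
    from state `start`, by backward recursion on the target-augmented state.
    P[a][s][s2]: transition probability; r[a][s][s2]: integer reward earned
    on the transition (hypothetical data layout)."""
    n_actions = len(P)

    @lru_cache(maxsize=None)
    def v(s, x, t):
        # With no steps left, we succeed iff the remaining target is met.
        if t == 0:
            return 1.0 if x <= 0 else 0.0
        return max(
            sum(P[a][s][s2] * v(s2, x - r[a][s][s2], t - 1)
                for s2 in range(len(P[a][s])))
            for a in range(n_actions))

    return v(start, target, horizon)
```

The target level plays the role of an extra state coordinate, which is why standard (reward-expectation) MDP models cannot capture this criterion directly.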

9.
Finite and infinite planning horizon Markov decision problems are formulated for a class of jump processes with general state and action spaces and controls which are measurable functions on the time axis taking values in an appropriate metrizable vector space. For the finite horizon problem, the maximum expected reward is the unique solution, which exists, of a certain differential equation and is a strongly continuous function in the space of upper semi-continuous functions. A necessary and sufficient condition is provided for an admissible control to be optimal, and a sufficient condition is provided for the existence of a measurable optimal policy. For the infinite horizon problem, the maximum expected total reward is the fixed point of a certain operator on the space of upper semi-continuous functions. A stationary policy is optimal over all measurable policies in the transient and discounted cases as well as, with certain added conditions, in the positive and negative cases.

10.
We consider the problem of combining replacements of multiple components in an operational planning phase. Within an infinite or finite time horizon, decisions concerning replacement of components are made at discrete time epochs. Computing the optimal solution of this problem is tractable only for a small number of components. We present a heuristic rolling-horizon approach that decomposes the problem: at each decision epoch an initial plan is made that addresses components separately, and subsequently a deviation from this plan is allowed to enable joint replacement. This approach provides insight into why certain actions are taken. The time needed to determine an action at a given epoch is only quadratic in the number of components. After dealing with harmonisation and horizon effects, our approach yields average costs less than 1% above the minimum value.

11.
Rim Amami, Optimization, 2013, 62(11): 1525-1552
We establish existence results for adapted solutions of infinite horizon backward stochastic differential equations with two reflected barriers. We also apply these results to get the existence of an optimal impulse control strategy for the infinite horizon impulse control problem. The properties of the Snell envelope reduce our problem to the existence of a pair of continuous processes.

12.
This paper studies the problem of a company that adjusts its stochastic production capacity in reversible investments with controls of expansion and contraction. The company may also decide on the activation time of its production. The production profit function is of a very general form satisfying minimal standard assumptions. The objective of the company is to find an optimal entry and production decision to maximize its expected total net profit over an infinite time horizon. The resulting dynamic programming principle is a two-step formulation of a singular stochastic control problem and an optimal stopping problem. The analysis of the value functions relies on viscosity solutions of the associated Bellman variational inequalities. We first state several general properties and, in particular, smoothness results on the value functions. We then provide a complete solution with explicit expressions of the value functions and the optimal controls: the company activates its production once a fixed entry threshold of the capacity is reached, and invests in capital so as to maintain its capacity in a closed bounded interval. The boundaries of these regions can be computed explicitly and their behavior is studied in terms of the parameters of the model.

13.
In this paper a single-facility location problem with multiple relocation opportunities is investigated. The weight associated with each demand point is a known function of time. We consider rectilinear, squared Euclidean, or Euclidean distances. Relocations can take place at pre-determined times. The objective is to minimize the total location and relocation costs. An algorithm which finds the optimal locations, relocation times and the total cost, for all three types of distance measurements and various weight functions, is developed. Locations are found using constant weights, and relocation times are the solution to a dynamic programming or binary integer programming (BIP) model. The time horizon can be finite or infinite.
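For the rectilinear case, the location subproblem with constant weights separates by coordinate, and each coordinate reduces to a weighted median. A minimal one-dimensional sketch of that subproblem (not the paper's full relocation algorithm):

```python
def weighted_median(points, weights):
    """Optimal 1-D rectilinear facility location: the weighted median
    minimizes sum_i w_i * |x - p_i|. Sketch of the location subproblem
    solved with constant weights."""
    order = sorted(range(len(points)), key=lambda i: points[i])
    half = sum(weights) / 2.0
    acc = 0.0
    for i in order:
        acc += weights[i]
        # The first point where cumulative weight reaches half the total
        # weight is an optimal location (ties give a whole optimal segment).
        if acc >= half:
            return points[i]
```

With time-varying weights aggregated over each interval between relocations, the same routine yields the candidate locations fed into the dynamic programming model over relocation times.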

14.
We present in this paper several asymptotic properties of constrained Markov Decision Processes (MDPs) with a countable state space. We treat both the discounted and the expected average cost, with unbounded cost. We are interested in (1) the convergence of finite horizon MDPs to the infinite horizon MDP, (2) convergence of MDPs with a truncated state space to the problem with infinite state space, (3) convergence of MDPs as the discount factor goes to a limit. In all these cases we establish the convergence of optimal values and policies. Moreover, based on the optimal policy for the limiting problem, we construct policies which are almost optimal for the other (approximating) problems. Based on the convergence of MDPs with a truncated state space to the problem with infinite state space, we show that an optimal stationary policy exists such that the number of randomisations it uses is less than or equal to the number of constraints plus one. We finally apply the results to a dynamic scheduling problem. This work was partially supported by the Chateaubriand fellowship from the French embassy in Israel and by the European Grant BRA-QMIPS of CEC DG XIII.
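Point (1), the convergence of finite horizon values to the infinite horizon value, can be illustrated numerically in the simplest unconstrained finite-state discounted setting (a sketch only; `P` and `r` are hypothetical data, and the infinite-horizon value is proxied by long value iteration):

```python
import numpy as np

def value_iteration_gap(P, r, gamma, n):
    """Return sup-norm gaps |V_k - V_inf| for horizons k = 1..n, where
    V_k is the k-step finite-horizon value and V_inf the infinite-horizon
    value (approximated here by a long run of value iteration)."""
    n_states, n_actions = P[0].shape[0], len(P)
    # Proxy for the infinite-horizon value function.
    v_inf = np.zeros(n_states)
    for _ in range(10000):
        v_inf = np.max([r[a] + gamma * P[a] @ v_inf
                        for a in range(n_actions)], axis=0)
    # Finite-horizon values via backward induction from terminal value 0.
    v, gaps = np.zeros(n_states), []
    for _ in range(n):
        v = np.max([r[a] + gamma * P[a] @ v for a in range(n_actions)], axis=0)
        gaps.append(float(np.max(np.abs(v - v_inf))))
    return gaps
```

Because the Bellman operator is a gamma-contraction in the sup norm, the gaps decay geometrically, which is the unconstrained analogue of the convergence established in the paper.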

15.
In this paper, we consider discrete-time systems. We study conditions under which there is a unique control that minimizes a general quadratic cost functional. The system considered is described by a linear time-invariant recurrence equation in which the number of inputs equals the number of states. The cost functional differs from the usual one considered in optimal control theory, in the sense that we do not assume that the weight matrices considered are semipositive definite. For both a finite planning horizon and an infinite horizon, necessary and sufficient solvability conditions are given. Furthermore, necessary and sufficient conditions are derived for the existence of a solution for an arbitrary finite planning horizon. The author dedicates this paper to the memory of his late grandfather Jacob Oosterwold.

16.
We prove a general theorem that the -valued solution of an infinite horizon backward doubly stochastic differential equation, if it exists, gives the stationary solution of the corresponding stochastic partial differential equation. We prove the existence and uniqueness of the -valued solutions for backward doubly stochastic differential equations on finite and infinite horizons with linear growth, without assuming Lipschitz conditions but under a monotonicity condition. Therefore the solution of the finite horizon problem gives the solution of the initial value problem of the corresponding stochastic partial differential equations, and the solution of the infinite horizon problem gives the stationary solution of the SPDEs according to our general result.

17.
We consider the optimal consumption-investment problem under the drawdown constraint, i.e. the wealth process never falls below a fixed fraction of its running maximum. We assume that the risky asset is driven by a Black-Scholes model with constant coefficients, and we consider a general class of utility functions. On an infinite time horizon, Elie and Touzi (Preprint, [2006]) provided the value function as well as the optimal consumption and investment strategy in explicit form. In a more realistic setting, we consider here an agent optimizing its consumption-investment strategy over a finite time horizon. The value function is interpreted as the unique discontinuous viscosity solution of the corresponding Hamilton-Jacobi-Bellman equation. This leads to a numerical approximation of the value function and allows for a comparison with the explicit solution in the infinite horizon case.

18.
This paper studies a single-product, dynamic, non-stationary, stochastic inventory problem with capacity commitment, in which a buyer purchases a fixed capacity from a supplier at the beginning of a planning horizon and the buyer's total cumulative order quantity over the planning horizon is constrained by this capacity. The objective of the buyer is to choose the capacity at the beginning of the planning horizon and the order quantity in each period so as to minimize the expected total cost over the planning horizon. We characterize, for a given capacity, the structure of the minimum sum of the expected ordering, storage and shortage costs in a period and thereafter, together with the optimal ordering policy. Based on this structure, we identify conditions under which a myopic ordering policy is optimal and derive an equation for the optimal capacity commitment. We then use the optimal capacity and the myopic ordering policy to evaluate the effect of the various parameters on the minimum expected total cost over the planning horizon.
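A myopic ordering policy of the kind discussed here reduces, in each single period, to a newsvendor order-up-to level. A minimal sketch of that building block via the critical-fractile rule, assuming discrete demand given as a `{value: probability}` map with linear holding cost `h` and shortage cost `b` (an illustrative simplification, not the paper's capacitated model):

```python
def myopic_base_stock(demand_pmf, h, b):
    """Smallest order-up-to level S with F(S) >= b / (b + h), which
    minimizes one period's expected holding plus shortage cost."""
    ratio = b / (b + h)
    acc = 0.0
    for d in sorted(demand_pmf):
        acc += demand_pmf[d]
        if acc >= ratio:
            return d
```

Under the capacity commitment, the actual order in each period would additionally be capped by the remaining cumulative capacity, which is where the myopic policy can lose optimality.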

19.
We consider risk-minimizing problems in discounted Markov decision processes with a countable state space and bounded general rewards. We characterize optimal values for the finite and infinite horizon cases and give two sufficient conditions for the existence of an optimal policy in the infinite horizon case. These conditions are closely connected with Lemma 3 in White (1993), which is not correct, as Wu and Lin (1999) point out. We obtain a condition for the lemma to be true, under which we show that there is an optimal policy. Under another condition we show that the optimal value is the unique solution of an optimality equation and that there is an optimal policy on a transient set.

20.
Time-discrete systems with a finite set of states are considered. Discrete optimal control problems with infinite time horizon for such systems are formulated. We introduce a certain graph-theoretic structure to model the transitions of the dynamical system. Algorithms for finding the optimal stationary control parameters are presented. Furthermore, we determine the optimal mean cost cycles. This approach can be used as a decision support strategy within such a class of problems; in particular, so-called multilayered decision problems, which occur within environmental emission trading procedures, can be modelled by this approach.
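An optimal mean cost cycle in a finite transition graph can be computed by Karp's classical minimum mean cycle algorithm; a sketch assuming a standard edge-list representation (not tied to the paper's specific graph structure):

```python
def min_mean_cycle(n, edges):
    """Karp's algorithm for the minimum mean-weight cycle of a directed
    graph with vertices 0..n-1. edges: list of (u, v, w). Returns the
    minimum cycle mean, or None if the graph is acyclic."""
    INF = float('inf')
    # d[k][v] = minimum weight of a walk with exactly k edges ending at v,
    # starting from any vertex (d[0][v] = 0 for all v).
    d = [[INF] * n for _ in range(n + 1)]
    for v in range(n):
        d[0][v] = 0.0
    for k in range(1, n + 1):
        for u, v, w in edges:
            if d[k - 1][u] + w < d[k][v]:
                d[k][v] = d[k - 1][u] + w
    best = None
    for v in range(n):
        if d[n][v] == INF:
            continue
        # Karp's formula: min over v of max over k of this ratio.
        worst = max((d[n][v] - d[k][v]) / (n - k)
                    for k in range(n) if d[k][v] < INF)
        if best is None or worst < best:
            best = worst
    return best
```

The minimum cycle mean equals the optimal long-run average cost of a stationary control in this deterministic setting, which is why mean cost cycles appear in the infinite horizon formulation.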
