Similar Literature
20 similar documents retrieved (search time: 31 ms)
1.
Some problems of ergodic control and adaptive control are formulated and solved for stochastic differential delay systems. The existence and uniqueness of invariant measures that are solutions of the stochastic functional differential equations for these systems are verified. For an ergodic cost criterion, almost optimal controls are constructed. For an unknown system, the invariant measures and the optimal ergodic costs are shown to be continuous functions of the unknown parameters. Almost self-optimizing adaptive controls are feasibly constructed by an approximate certainty equivalence principle. This research was partially supported by NSF Grants ECS-91-02714 and ECS-91-13029.

2.
An adaptive control problem is formulated and solved for a completely observed, continuous-time, linear stochastic system with an ergodic quadratic cost criterion. The linear transformations A of the state, B of the control, and C of the noise are assumed to be unknown. Assuming only that A is stable and that the pair (A, C) is controllable, and using a diminishing excitation control that is asymptotically negligible for an ergodic, quadratic cost criterion, it is shown that a family of least-squares estimates is strongly consistent. Furthermore, an adaptive control using switchings is given that is self-optimizing for an ergodic, quadratic cost criterion. This research was partially supported by NSF Grants ECS-9102714, ECS-9113029, and DMS-9305936.
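The combination of recursive least-squares estimation with a diminishing excitation signal can be illustrated on a discrete-time scalar analogue (the paper itself treats continuous-time matrix systems). This is a minimal sketch with invented parameters: the true pair (a, b) = (0.5, 1.0) is unknown to the controller, which applies a certainty-equivalence feedback plus a dither that decays like k^(-1/4), so the excitation is asymptotically negligible while its energy still diverges.

```python
import numpy as np

rng = np.random.default_rng(0)
a_true, b_true = 0.5, 1.0          # "unknown" system parameters
theta = np.zeros(2)                 # running estimate of (a, b)
P = np.eye(2) * 100.0               # RLS covariance
x = 0.0
for k in range(1, 20001):
    # certainty-equivalence feedback based on the current estimate,
    # plus a diminishing dither signal for excitation
    a_hat, b_hat = theta
    u_ce = -a_hat * x / b_hat if abs(b_hat) > 1e-3 else 0.0
    u = u_ce + rng.standard_normal() / k**0.25   # dither decays to zero
    x_next = a_true * x + b_true * u + 0.1 * rng.standard_normal()
    # recursive least-squares update of theta from (x, u) -> x_next
    phi = np.array([x, u])
    denom = 1.0 + phi @ P @ phi
    K = P @ phi / denom
    theta = theta + K * (x_next - phi @ theta)
    P = P - np.outer(K, phi @ P)
    x = x_next

print(theta)  # should approach (0.5, 1.0)
```

Because the dither variance decays like k^(-1/2), its running sum diverges, which is the discrete analogue of the excitation condition that makes the least-squares family strongly consistent.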

3.
This paper concerns nonstationary continuous-time Markov control processes on Polish spaces, with the infinite-horizon discounted cost criterion. Necessary and sufficient conditions are given for a control policy to be optimal and asymptotically optimal. In addition, under suitable hypotheses, it is shown that the successive approximation procedure converges in the sense that the sequence of finite-horizon optimal cost functions and the corresponding optimal control policies both converge.
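The successive approximation idea is easiest to see in its classical discrete-time, stationary form, where the discounted Bellman operator is a sup-norm contraction. The following toy example (all transition probabilities and costs invented) iterates the operator until the finite-horizon value functions converge to the infinite-horizon one:

```python
import numpy as np

# Toy discounted MDP: 2 states, 2 actions (a discrete-time stand-in for
# the successive-approximation scheme; all numbers are made up).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],    # P[s, a, s']
              [[0.5, 0.5], [0.1, 0.9]]])
c = np.array([[1.0, 2.0],                  # c[s, a] one-step costs
              [0.5, 3.0]])
beta = 0.9                                 # discount factor

V = np.zeros(2)
for n in range(500):
    Q = c + beta * P @ V                   # Q[s, a] = c + beta * E[V(s')]
    V_new = Q.min(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-10:  # contraction: geometric convergence
        break
    V = V_new
policy = Q.argmin(axis=1)
print(V, policy)
```

Each pass computes an n-horizon optimal cost; the contraction property forces both the cost functions and the minimizing policies to stabilize, mirroring the convergence statement in the abstract.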

4.
We consider a discrete-time constrained Markov decision process under the discounted cost optimality criterion. The state and action spaces are assumed to be Borel spaces, while the cost and constraint functions might be unbounded. We are interested in approximating numerically the optimal discounted constrained cost. To this end, we suppose that the transition kernel of the Markov decision process is absolutely continuous with respect to some probability measure μ. Then, by solving the linear programming formulation of a constrained control problem related to the empirical probability measure μn of μ, we obtain the corresponding approximation of the optimal constrained cost. We derive a concentration inequality which gives bounds on the probability that the estimation error is larger than some given constant. This bound is shown to decrease exponentially in n. Our theoretical results are illustrated with a numerical application based on a stochastic version of the Beverton–Holt population model.
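The linear-programming formulation referred to here can be sketched on a tiny finite constrained MDP (leaving out the paper's empirical-measure approximation step). The LP variables are the discounted occupation measures z[s, a]; the balance equations tie them to the transition kernel, the objective is the discounted cost, and the constraint cost enters as a linear inequality. All model data below are invented:

```python
import numpy as np
from scipy.optimize import linprog

# LP over occupation measures for a small discounted constrained MDP.
P = np.array([[[0.8, 0.2], [0.3, 0.7]],
              [[0.6, 0.4], [0.1, 0.9]]])   # P[s, a, s']
c = np.array([[2.0, 0.5], [1.0, 0.2]])     # cost to minimize
d = np.array([[0.0, 1.0], [0.0, 1.0]])     # constraint cost (action-1 usage)
beta, kappa = 0.9, 3.0                     # discount factor, constraint bound
nu = np.array([0.5, 0.5])                  # initial distribution

nS, nA = 2, 2
# Balance: sum_a z[s',a] - beta * sum_{s,a} P[s,a,s'] z[s,a] = nu[s'],
# so the total mass of z is 1 / (1 - beta).
A_eq = np.zeros((nS, nS * nA))
for sp in range(nS):
    for s in range(nS):
        for a in range(nA):
            A_eq[sp, s * nA + a] = float(s == sp) - beta * P[s, a, sp]
res = linprog(c.ravel(), A_ub=[d.ravel()], b_ub=[kappa],
              A_eq=A_eq, b_eq=nu, bounds=[(0, None)] * (nS * nA))
print(res.fun)          # optimal constrained discounted cost
```

In the paper's scheme, the transition kernel in the balance equations would be replaced by one built from the empirical measure μn, and the concentration inequality controls how far the resulting LP value can drift from the true optimal constrained cost.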

5.
Optimization, 2012, 61(2): 179–196
This article concerns n-dimensional controlled diffusion processes. The main problem is to maximize a certain long-run average reward (also known as an ergodic reward) in such a way that a given long-run average cost is bounded above by a constant. Under suitable assumptions, the existence of optimal controls for such constrained control problems is a well-known fact. In this article we go a bit further and our goal is to introduce a technique to compute optimal controls. To this end, we follow the Lagrange multipliers approach. An example on a linear-quadratic system illustrates our results.
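The Lagrange-multiplier approach can be illustrated on a tiny finite-state stand-in (the article itself treats controlled diffusions): for each multiplier value λ, maximize the unconstrained long-run average of r − λd over stationary deterministic policies, then keep the best policy along the sweep that satisfies the average-cost constraint. All model data are invented, and deterministic policies are only a sketch of the general (possibly randomized) case:

```python
import numpy as np
from itertools import product

# Constrained average-reward toy problem: maximize the long-run average of
# r subject to the long-run average of d being at most kappa.
P = np.array([[[0.9, 0.1], [0.4, 0.6]],
              [[0.5, 0.5], [0.2, 0.8]]])   # P[s, a, s']
r = np.array([[1.0, 3.0], [0.5, 2.0]])     # reward per (state, action)
d = np.array([[0.0, 2.0], [0.0, 2.0]])     # constrained cost per (state, action)
kappa = 1.0                                # constraint bound

def long_run_average(f, policy):
    """Average of f[s, policy[s]] under the policy's stationary distribution."""
    Pp = np.array([P[s, policy[s]] for s in range(2)])
    # stationary distribution via the standard rank-one perturbation trick
    pi = np.linalg.solve((np.eye(2) - Pp + 1.0).T, np.ones(2))
    return sum(pi[s] * f[s, policy[s]] for s in range(2))

best = None
for lam in np.linspace(0.0, 5.0, 201):     # sweep the multiplier
    # best stationary deterministic policy for the Lagrangian r - lam * d
    pol = max(product(range(2), repeat=2),
              key=lambda p: long_run_average(r - lam * d, p))
    rew, cost = long_run_average(r, pol), long_run_average(d, pol)
    if cost <= kappa and (best is None or rew > best[0]):
        best = (rew, cost, lam, pol)
print(best)   # (reward, cost, multiplier, policy) of the best feasible policy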

6.
We are concerned with a numerical method for the optimal control of the reflection directions of a reflected diffusion process, and with either a discounted or an ergodic cost criterion. The Markov chain approximation method is adapted to this nonclassical problem and convergence of the method is proved. The problem originally arose in the important application of rerouting in large trunk-line communications systems. A reasonable heavy-traffic approximation leads to the described model. The numerical method has been applied with success, but no proofs of convergence were given previously. Nonboundary controls can be easily added. This work was partially supported by Grants (AFOSR) F49620-92-J-008-1DEF, AFOSR-91-0375, and DAAH04-93-6-0070.

7.
In this paper, we consider constrained noncooperative N-person stochastic games with discounted cost criteria. The state space is assumed to be countable and the action sets are compact metric spaces. We present three main results. The first concerns the sensitivity or approximation of constrained games. The second shows the existence of Nash equilibria for constrained games with a finite state space (and compact action spaces), and, finally, in the third one we extend that existence result to a class of constrained games which can be "approximated" by constrained games with finitely many states and compact action spaces. Our results are illustrated with two examples on queueing systems, which clearly show some important differences between constrained and unconstrained games. Mathematics Subject Classification (2000): Primary: 91A15, 91A10; Secondary: 90C40

8.
For a stochastic differential inclusion given in terms of current velocities (symmetric mean derivatives) on a flat n-dimensional torus, we prove the existence of an optimal solution minimizing a certain cost criterion. This result is then applied to the problem of optimal control for equations with current velocities.

9.
We consider a controlled system driven by a coupled forward–backward stochastic differential equation with a nondegenerate diffusion matrix. The cost functional is defined by the solution of the controlled backward stochastic differential equation at the initial time. Our goal is to find an optimal control which minimizes the cost functional. The method consists in constructing a sequence of approximating controlled systems for which we show the existence of a sequence of feedback optimal controls. By passing to the limit, we establish the existence of a relaxed optimal control to the initial problem. The existence of a strict control follows from the Filippov convexity condition.

10.
We consider Markov Decision Processes under light traffic conditions. We develop an algorithm to obtain asymptotically optimal policies for both the total discounted and the average cost criterion. This gives a general framework for several light traffic results in the literature. We illustrate the method by deriving the asymptotically optimal control of a simple ATM network.

11.
In this paper, we study the infinite-horizon expected discounted continuous-time optimal control problem for Piecewise Deterministic Markov Processes with both impulsive and gradual (also called continuous) controls. The set of admissible control strategies is supposed to be formed by policies possibly randomized and depending on the past-history of the process. We assume that the gradual control acts on the jump intensity and on the transition measure, but not on the flow. The so-called Hamilton–Jacobi–Bellman (HJB) equation associated to this optimization problem is analyzed. We provide sufficient conditions for the existence of a solution to the HJB equation and show that the solution is in fact unique and coincides with the value function of the control problem. Moreover, the existence of an optimal control strategy is proven having the property to be stationary and non-randomized.

12.
For the ergodic control problem with degenerate diffusions, the existence of an optimal solution is established for various interesting classes of solutions. This research was supported by Grant No. 26/01/92-G from the Department of Atomic Energy, Government of India, Delhi, India.

13.
In this work we consider an L minimax ergodic optimal control problem with cumulative cost. We approximate the cost function as a limit of evolution problems. We present the associated Hamilton–Jacobi–Bellman equation and prove that it has a unique solution in the viscosity sense. As this HJB equation is consistent with a numerical procedure, we use this discretization to obtain a procedure for the original problem. For the numerical solution of the ergodic version we need a perturbation of the instantaneous cost function. We give an appropriate selection of the discretization and penalization parameters to obtain discrete solutions that converge to the optimal cost. We present numerical results. (© 2008 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)

14.
In this note, we consider the adaptive control of a linear diffusion process with regard to the discounted cost criterion. We show that the certainty-equivalence type of control, analogous to the one considered by Duncan and Pasik-Duncan (Ref. 1), is asymptotically discount optimal in the sense of Schäl (Ref. 2).

15.
Complementing existing results on minimal ruin probabilities, we minimize expected discounted penalty functions (or Gerber–Shiu functions) in a Cramér–Lundberg model by choosing optimal reinsurance. Reinsurance strategies are modeled as time dependent control functions, which lead to a setting from the theory of optimal stochastic control and ultimately to the problem’s Hamilton–Jacobi–Bellman equation. We show existence and uniqueness of the solution found by this method and provide numerical examples involving light and heavy tailed claims and also give a remark on the asymptotics.
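The underlying risk process can be sketched by Monte Carlo. The snippet below simulates a Cramér–Lundberg surplus under a *static* proportional reinsurance with retention level b (the paper optimizes dynamic, time-dependent reinsurance via an HJB equation, which this does not attempt); claim sizes are Exp(1), and all parameter values are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

def ruin_prob(b, x0=10.0, lam=1.0, premium=1.5, theta=0.7, T=100.0, n=1000):
    """Estimate P(ruin before T): claims arrive at Poisson(lam) times, the
    insurer retains the fraction b of each Exp(1) claim and pays the
    reinsurer a premium with safety loading theta."""
    net_premium = premium - (1 + theta) * lam * (1 - b)  # income rate after reinsurance
    ruined = 0
    for _ in range(n):
        x, t = x0, 0.0
        while True:
            dt = rng.exponential(1.0 / lam)              # time to next claim
            t += dt
            if t > T:
                break
            x += net_premium * dt - b * rng.exponential(1.0)
            if x < 0:                                    # surplus negative: ruin
                ruined += 1
                break
    return ruined / n

print(ruin_prob(1.0), ruin_prob(0.6))   # full retention vs. 60% retention
```

Buying more reinsurance (smaller b) damps claim volatility but also eats into the premium income, which is exactly the trade-off the HJB-based optimal strategy in the paper balances, and for Gerber–Shiu functions the penalty at ruin replaces the plain ruin indicator used here.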

16.
We study the ergodic control problem for a class of controlled jump diffusions driven by a compound Poisson process. This extends the results of Arapostathis et al. (2019) to running costs that are not near-monotone. This generality is needed in applications such as optimal scheduling of large-scale parallel server networks. We provide a full characterization of optimality via the Hamilton–Jacobi–Bellman (HJB) equation, for which we additionally exhibit regularity of solutions under mild hypotheses. In addition, we show that optimal stationary Markov controls are a.s. pathwise optimal. Lastly, we show that one can fix a stable control outside a compact set and obtain near-optimal solutions by solving the HJB on a sufficiently large bounded domain. This is useful for constructing asymptotically optimal scheduling policies for multiclass parallel server networks.

17.
Optimal control policies for a class of problems with stopping times
Using stochastic analysis and optimal control theory, we study a discounted-cost model with stochastic control and a stopping time. The integrand of the Riemann–Stieltjes integral in the cost structure of the original model is generalized from the constant 1 to a general function satisfying certain conditions, which makes the model more general. For different parameters, the existence of an optimal control is proved, and the structure of the optimal control policy and the form of the optimal cost function are characterized for different initial states.

18.
We study an infinite horizon optimal stopping Markov problem which is either undiscounted (total reward) or with a general Markovian discount rate. Using ergodic properties of the underlying Markov process, we establish the feasibility of the stopping problem and prove the existence of optimal and ε-optimal stopping times. We show the continuity of the value function and its variational characterisation (in the viscosity sense) under different sets of assumptions satisfied by large classes of diffusion and jump–diffusion processes. In the case of a general discounted problem we relax a classical assumption that the discount rate is uniformly separated from zero.

19.
GUO Xianping, Acta Mathematica Sinica, 2001, 44(2): 333–342
This paper considers the average-variance criterion for nonstationary MDPs with Borel state and action spaces. First, under an ergodicity condition, the optimality equation is used to prove the existence of a Markov policy that is optimal for the average expected criterion. Then, by constructing a new model and using the theory of Markov processes, it is further shown that among the Markov policies optimal for the average expected criterion there exists one that minimizes the average variance. The main results of Dynkin E. B. and Yushkevich A. A., and of Kurano M., are obtained as special cases.

20.
We study risk-sensitive differential games for controlled reflecting diffusion processes in a bounded domain. We consider both nonzero-sum and zero-sum cases. We treat two cost evaluation criteria; namely, discounted cost and ergodic cost. Under certain assumptions we establish the existence of Nash/saddle-point equilibria for relevant cases.
