20 similar documents found; search took 0 ms
1.
2.
This paper deals with discrete-time Markov control processes with Borel state and control spaces, with possibly unbounded costs and noncompact control constraint sets, and the average cost criterion. Conditions are given for the convergence of the value iteration algorithm to the optimal average cost, and for a sequence of finite-horizon optimal policies to have an accumulation point which is average cost optimal. This research was partially supported by the Consejo Nacional de Ciencia y Tecnología (CONACyT) under grant 1332-E9206.
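To make the value iteration idea concrete, here is a minimal sketch of relative value iteration for the average-cost criterion on a small finite MDP. The transition and cost arrays are invented for illustration only; the paper itself treats Borel spaces, unbounded costs, and noncompact constraint sets, which this toy example does not capture.

```python
import numpy as np

# Hypothetical finite MDP used only to illustrate the algorithm:
# P[a, s, s'] = transition probability, c[s, a] = one-stage cost.
P = np.array([[[0.7, 0.2, 0.1],
               [0.1, 0.8, 0.1],
               [0.2, 0.3, 0.5]],
              [[0.3, 0.3, 0.4],
               [0.5, 0.4, 0.1],
               [0.1, 0.1, 0.8]]])
c = np.array([[1.0, 2.0],
              [0.5, 1.5],
              [2.0, 0.3]])

def relative_value_iteration(P, c, n_iter=500, ref_state=0):
    """Relative value iteration for the average-cost criterion:
    subtracting the value at a reference state keeps the iterates
    bounded, and the subtracted constant converges to the optimal
    average cost under standard ergodicity conditions."""
    h = np.zeros(c.shape[0])
    gain = 0.0
    for _ in range(n_iter):
        # Q[s, a] = c(s, a) + sum_{s'} P(s' | s, a) * h(s')
        Q = c + np.tensordot(P, h, axes=([2], [0])).T
        Th = Q.min(axis=1)
        gain = Th[ref_state]
        h = Th - gain
    policy = Q.argmin(axis=1)
    return gain, h, policy

gain, h, policy = relative_value_iteration(P, c)
print("estimated optimal average cost:", gain)
print("greedy stationary policy:", policy)
```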
3.
Nadine Hilgert, J. Adolfo Minjárez-Sosa 《Mathematical Methods of Operations Research》2001,54(3):491-505
We consider a class of time-varying stochastic control systems, with Borel state and action spaces, and possibly unbounded costs. The processes evolve according to a discrete-time equation x_{n+1} = G_n(x_n, a_n, ξ_n), n = 0, 1, …, where the ξ_n are i.i.d. ℝ^k-valued random vectors whose common density is unknown, and the G_n are given functions converging, in a restricted way, to some function G_∞ as n → ∞. Assuming observability of ξ_n, we construct an adaptive policy which is asymptotically discounted cost optimal for the limiting control system x_{n+1} = G_∞(x_n, a_n, ξ_n).
4.
5.
Tomás Prieto-Rumeau 《Acta Appl Math》2006,92(1):77-96
This paper deals with Blackwell optimality for continuous-time controlled Markov chains with compact Borel action space, and possibly unbounded reward (or cost) rates and unbounded transition rates. We prove the existence of a deterministic stationary policy which is Blackwell optimal in the class of all admissible (nonstationary) Markov policies, thus extending previous results that analyzed Blackwell optimality in the class of stationary policies. We compare our assumptions to the corresponding ones for discrete-time Markov controlled processes.
6.
7.
We consider a class of discrete-time Markov control processes with Borel state and action spaces, and ℝ^d-valued i.i.d. disturbances with unknown distribution. Under mild semi-continuity and compactness conditions, and assuming that the disturbance distribution is absolutely continuous with respect to Lebesgue measure, we establish the existence of adaptive control policies which are (1) optimal for the average-reward criterion, and (2) asymptotically optimal in the discounted case. Our results are obtained by taking advantage of some well-known facts in the theory of density estimation. This approach allows us to avoid restrictive conditions on the state space and/or on the system's transition law imposed in recent works, and on the other hand, it clearly shows the way to other applications of nonparametric (density) estimation to adaptive control. Research partially supported by The Third World Academy of Sciences under Research Grant No. MP 898-152.
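One standard way to exploit the absolute-continuity assumption is nonparametric density estimation of the disturbance law from observed noise samples, and then plugging the estimate into expected one-stage costs. The sketch below uses a Gaussian kernel estimator and a quadratic cost purely as an illustration; the paper does not prescribe this particular estimator or cost.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(1)

# Stand-in for observed i.i.d. disturbances drawn from the unknown density.
samples = rng.gamma(shape=2.0, scale=1.0, size=500)

density_hat = gaussian_kde(samples)   # nonparametric (kernel) density estimate

def expected_cost(c, x, a, n_mc=2000):
    """Approximate E[c(x, a, xi)] by Monte Carlo over the estimated density."""
    xi = density_hat.resample(n_mc)[0]
    return np.mean(c(x, a, xi))

quad_cost = lambda x, a, xi: (x + a - xi) ** 2
print("estimated one-stage expected cost:", expected_cost(quad_cost, x=1.0, a=0.5))
```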
8.
This paper presents a nonlinear, multi-phase, stochastic dynamical system motivated by an engineering background. We show that the stochastic dynamical system has a unique solution for every initial state. A stochastic optimal control model is constructed, and necessary and sufficient conditions for optimality are proved via the dynamic programming principle. This model can be converted into a parametric nonlinear stochastic program by integrating the state equation. We show that the local optimal solution depends continuously on the parameters. A revised Hooke–Jeeves algorithm based on this property has been developed. Computer simulations are carried out, and the numerical results illustrate the validity and efficiency of the algorithm.
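The numerical method builds on the classical Hooke–Jeeves pattern search. A minimal textbook version (without the paper's revisions or the stochastic-programming reformulation, and with an invented quadratic test function) looks roughly like this:

```python
import numpy as np

def hooke_jeeves(f, x0, step=1.0, shrink=0.5, tol=1e-6, max_iter=1000):
    """Basic Hooke-Jeeves pattern search: exploratory moves along the
    coordinate axes followed by a pattern (acceleration) move; the step
    size is halved whenever no improvement is found."""
    x = np.asarray(x0, dtype=float)
    fx = f(x)
    for _ in range(max_iter):
        # Exploratory move around the current base point.
        x_new, f_new = x.copy(), fx
        for i in range(x.size):
            for delta in (step, -step):
                trial = x_new.copy()
                trial[i] += delta
                f_trial = f(trial)
                if f_trial < f_new:
                    x_new, f_new = trial, f_trial
                    break
        if f_new < fx:
            # Pattern move: extrapolate along the successful direction.
            pattern = x_new + (x_new - x)
            f_pattern = f(pattern)
            if f_pattern < f_new:
                x_new, f_new = pattern, f_pattern
            x, fx = x_new, f_new
        else:
            step *= shrink
            if step < tol:
                break
    return x, fx

x_opt, f_opt = hooke_jeeves(lambda v: (v[0] - 1.0) ** 2 + (v[1] + 2.0) ** 2,
                            [0.0, 0.0])
print("minimizer:", x_opt, "value:", f_opt)
```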
9.
Walter Alt 《Numerical Functional Analysis & Optimization》2013,34(11-12):1065-1076
In this paper we study a problem of parameter estimation in two-point boundary value problems. Using a stability theorem for nonlinear cone-constrained optimization problems derived in Part 1 of this paper, we investigate stability properties of the solutions of the parameter estimation problem in the output-least-squares formulation.
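The output-least-squares formulation means choosing the parameter so that the boundary value problem's solution best matches observed data. A small sketch under invented assumptions (a scalar parameter k in the hypothetical BVP y'' = -k·y with y(0)=0, y(1)=1, and synthetic noisy observations) is given below; it illustrates the formulation only, not the stability analysis of the paper.

```python
import numpy as np
from scipy.integrate import solve_bvp
from scipy.optimize import minimize_scalar

def model_output(k, x_eval):
    """Solve the hypothetical two-point BVP y'' = -k*y, y(0)=0, y(1)=1
    for a given parameter k and return y at the observation points."""
    def rhs(x, y):
        return np.vstack([y[1], -k * y[0]])
    def bc(ya, yb):
        return np.array([ya[0], yb[0] - 1.0])
    x_mesh = np.linspace(0.0, 1.0, 11)
    y_init = np.vstack([x_mesh, np.ones_like(x_mesh)])
    sol = solve_bvp(rhs, bc, x_mesh, y_init)
    return sol.sol(x_eval)[0]

rng = np.random.default_rng(3)
k_true = 4.0
x_obs = np.linspace(0.1, 0.9, 9)
y_obs = model_output(k_true, x_obs) + 0.01 * rng.normal(size=x_obs.size)

# Output-least-squares: choose k to minimize the misfit to the observations.
fit = minimize_scalar(lambda k: np.sum((model_output(k, x_obs) - y_obs) ** 2),
                      bounds=(0.5, 8.0), method='bounded')
print("estimated parameter k:", fit.x)
```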
10.
In this paper we are concerned with the existence of optimal stationary policies for infinite-horizon risk-sensitive Markov control processes with denumerable state space, unbounded cost function, and long-run average cost. Introducing a discounted cost dynamic game, we prove that its value function satisfies an Isaacs equation, and its relationship with the risk-sensitive control problem is studied. Using the vanishing discount approach, we prove that the risk-sensitive dynamic programming inequality holds, and derive an optimal stationary policy.
Accepted 1 October 1997
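The vanishing-discount idea mentioned here, letting the discount factor tend to one and studying normalized discounted values, can be illustrated numerically on an ordinary finite MDP. The toy model below is hypothetical and uses the standard expected (risk-neutral) cost, not the risk-sensitive cost actually treated in the paper.

```python
import numpy as np

# Hypothetical two-state, two-action MDP (illustration only): P[a, s, s'], c[s, a].
P = np.array([[[0.9, 0.1],
               [0.2, 0.8]],
              [[0.5, 0.5],
               [0.6, 0.4]]])
c = np.array([[1.0, 0.4],
              [0.7, 1.2]])

def discounted_value(alpha, n_iter=5000):
    """Standard value iteration for the alpha-discounted cost."""
    V = np.zeros(c.shape[0])
    for _ in range(n_iter):
        Q = c + alpha * np.tensordot(P, V, axes=([2], [0])).T
        V = Q.min(axis=1)
    return V

# Vanishing discount: (1 - alpha) * V_alpha approaches the optimal average
# cost, and V_alpha(x) - V_alpha(x0) approaches a relative value function.
for alpha in (0.9, 0.99, 0.999):
    V = discounted_value(alpha)
    print(alpha, (1.0 - alpha) * V, V - V[0])
```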
11.
B. Dochviri 《Georgian Mathematical Journal》1995,2(4):335-346
The connection between the optimal stopping problems for an inhomogeneous standard Markov process and the corresponding homogeneous Markov process constructed in the extended state space is established. An excessive characterization of the value function and the limit procedure for its construction in the problem of optimal stopping of an inhomogeneous standard Markov process are given. The form of ε-optimal (optimal) stopping times is also found.
12.
Adrien Brandejsky, Benoîte de Saporta, François Dufour 《Stochastic Processes and their Applications》2013
This paper deals with the optimal stopping problem under partial observation for piecewise-deterministic Markov processes. We first obtain a recursive formulation of the optimal filter process and derive the dynamic programming equation of the partially observed optimal stopping problem. Then, we propose a numerical method, based on the quantization of the discrete-time filter process and the inter-jump times, to approximate the value function and to compute an ε-optimal stopping time. We prove the convergence of the algorithms and bound the rates of convergence.
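Quantization here means replacing a continuous-state quantity by a finite grid chosen to keep the L² distortion small. A minimal one-dimensional sketch using Lloyd's fixed-point iteration on simulated samples is shown below; it is a generic illustration with invented data, not the paper's actual scheme for the filter process and inter-jump times.

```python
import numpy as np

rng = np.random.default_rng(2)

def lloyd_quantizer(samples, n_points=8, n_iter=50):
    """One-dimensional Lloyd iteration: alternate nearest-point assignment
    and centroid updates to reduce the quadratic quantization error."""
    grid = np.quantile(samples, np.linspace(0.05, 0.95, n_points))
    for _ in range(n_iter):
        idx = np.abs(samples[:, None] - grid[None, :]).argmin(axis=1)
        for j in range(n_points):
            cell = samples[idx == j]
            if cell.size > 0:
                grid[j] = cell.mean()
    return np.sort(grid)

# Stand-in for simulated inter-jump times of a PDP.
samples = rng.exponential(scale=1.0, size=10000)
print("quantization grid:", np.round(lloyd_quantizer(samples), 3))
```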
13.
A new approach to the optimal control of diffusion processes based on Lagrange functionals is presented. The method is conceptually and technically simpler than existing ones. A first class of functionals allows one to obtain optimality conditions without any resort to stochastic calculus and functional analysis. A second class, which requires Itô's rule, allows one to establish optimality in a larger class of problems. Calculations in these two methods are sometimes akin to those in minimum principles and in dynamic programming, but the thinking behind them is new. A few examples are worked out to illustrate the power and simplicity of this approach. Research performed at the Mathematisches Seminar der Universität Kiel with support provided by an Alexander von Humboldt Foundation fellowship.
14.
In this paper we consider the problem of optimal stopping and continuous control on some local parameters of a piecewise-deterministic Markov process (PDP). Optimality equations are obtained in terms of a set of variational inequalities as well as in terms of the first jump time operator of the PDP. It is shown that if the final cost function is absolutely continuous along trajectories then so is the value function of the optimal stopping problem with continuous control. These results unify and generalize previous ones in the current literature.
15.
Impulsive control of continuous-time Markov processes with risk-sensitive long-run average cost is considered. The most general impulsive control problem is studied under the restriction that impulses are applied at dyadic moments only. In the particular case of additive cost for impulses, the impulsive control problem is solved without restrictions on the moments of impulses.
Accepted 30 April 2001. Online publication 29 August 2001.
16.
17.
In this paper we consider the problem of impulse and continuous control on the jump rate and post-jump location parameters of piecewise-deterministic Markov processes (PDPs). In a companion paper we studied the optimal stopping problem with continuous control for PDPs, assuming only that the final cost function is absolutely continuous along trajectories. In this paper we apply these results to obtain optimality equations for the impulse and continuous control problem of PDPs in terms of a set of quasi-variational inequalities as well as in terms of the first jump time operator of the process. No continuity or differentiability assumptions on the whole state space, nor stability assumptions on the parameters of the problem, are required. It is shown that if the post-intervention operator satisfies certain local Lipschitz continuity properties along trajectories, then so does the value function of the impulse and continuous control problem.
18.
19.
We consider the Bellman equation related to the quadratic ergodic control problem for stochastic differential systems with controller constraints. We solve this equation rigorously in the C^2-class, and give the minimal value and the optimal control.
Accepted 9 January 1997
20.
Ronald Ortner 《Operations Research Letters》2007,35(5):619-626
In ergodic MDPs we consider stationary distributions of policies that coincide in all but n states, in which one of two possible actions is chosen. We give conditions and formulas for linear dependence of the stationary distributions of n+2 such policies, and show some results about combinations and mixtures of policies.
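For a fixed stationary policy in an ergodic MDP, the stationary distribution is obtained by solving the balance equations of that policy's transition matrix. The sketch below constructs a small hypothetical MDP and two policies that coincide in all but one state, as in the abstract's setting; the transition kernels are invented for illustration, and the paper's linear-dependence formulas are not reproduced.

```python
import numpy as np

# Hypothetical ergodic MDP with 3 states and 2 actions: P[a, s, s'].
P = np.array([[[0.6, 0.3, 0.1],
               [0.2, 0.5, 0.3],
               [0.3, 0.3, 0.4]],
              [[0.1, 0.6, 0.3],
               [0.4, 0.4, 0.2],
               [0.2, 0.2, 0.6]]])

def stationary_distribution(P_pi):
    """Solve mu @ P_pi = mu with sum(mu) = 1 for an ergodic stochastic matrix."""
    n = P_pi.shape[0]
    A = np.vstack([P_pi.T - np.eye(n), np.ones((1, n))])
    b = np.concatenate([np.zeros(n), [1.0]])
    mu, *_ = np.linalg.lstsq(A, b, rcond=None)
    return mu

# Two stationary policies that coincide in all but one state (state 0).
policy_1 = [0, 1, 0]
policy_2 = [1, 1, 0]
for pi in (policy_1, policy_2):
    P_pi = np.array([P[a, s] for s, a in enumerate(pi)])
    print(pi, np.round(stationary_distribution(P_pi), 4))
```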