Similar Documents
Found 20 similar documents (search time: 31 ms)
1.
《Optimization》2012,61(7):1593-1623
This paper deals with the ratio and time expected average criteria for constrained semi-Markov decision processes (SMDPs). The state and action spaces are Polish spaces, the rewards and costs are unbounded from above and from below, and the mean holding times are allowed to be unbounded from above. First, under general conditions we prove the existence of constrained-optimal policies for the ratio expected average criterion by developing a technique of occupation measures including the mean holding times for SMDPs, which are the generalizations of those for the standard discrete-time and continuous-time MDPs. Then, we give suitable conditions under which we establish the equivalence of the two average criteria by the optional sampling theorem, and thus we show the existence of constrained-optimal policies for the time expected average criterion. Finally, we illustrate the application of our main results with a controlled linear system, for which an exact optimal policy is obtained.

2.
The paper deals with value functions for optimal stopping and impulsive control for piecewise-deterministic processes with discounted cost. The associated dynamic programming equations are variational and quasi-variational inequalities with integral and first-order differential terms. The technique used is to approximate the value functions for an optimal stopping (impulsive control, switching control) problem for a piecewise-deterministic process by value functions for optimal stopping (impulsive control, switching control) problems for Feller piecewise-deterministic processes.

3.
This paper investigates finite-horizon semi-Markov decision processes with denumerable states. Optimality is considered over the class of all randomized history-dependent policies, which may depend on both the states and the planning horizons, and the cost rate function is assumed to be bounded below. Under suitable conditions, we show that the value function is a minimum nonnegative solution to the optimality equation and that an optimal policy exists. Moreover, we develop an effective algorithm for computing optimal policies, derive some properties of optimal policies, and illustrate our main results with a maintenance system.
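The backward-induction idea behind such a finite-horizon algorithm can be sketched in a few lines. The sketch below is a discrete-time simplification (the paper itself works with semi-Markov processes on denumerable state spaces), and the two-state "maintenance" example, its costs, and its transition probabilities are all hypothetical:

```python
# A minimal backward-induction sketch for a finite-horizon, cost-minimizing
# decision process.  Discrete-time simplification of the paper's SMDP setting;
# the toy maintenance model below is invented for illustration.

def backward_induction(states, actions, cost, transition, horizon):
    """Compute V_t(s) = min_a [ cost(s,a) + sum_{s'} P(s'|s,a) * V_{t+1}(s') ]."""
    V = {s: 0.0 for s in states}              # terminal values V_T = 0
    policy = []                               # policy[t][s] = optimal action
    for _ in range(horizon):
        newV, pi = {}, {}
        for s in states:
            best_a, best_q = None, float("inf")
            for a in actions:
                q = cost(s, a) + sum(p * V[s2] for s2, p in transition(s, a))
                if q < best_q:
                    best_a, best_q = a, q
            newV[s], pi[s] = best_q, best_a
        V = newV
        policy.insert(0, pi)                  # prepend: we iterate backwards
    return V, policy

# Toy maintenance system: state 0 = working, 1 = failed (hypothetical numbers).
def cost(s, a):
    return 2.0 if a == "repair" else (0.0 if s == 0 else 5.0)

def transition(s, a):
    if a == "repair":
        return [(0, 1.0)]                     # repair always restores
    return [(0, 0.9), (1, 0.1)] if s == 0 else [(1, 1.0)]

V, policy = backward_induction([0, 1], ["run", "repair"], cost, transition, 3)
```

Here `V` holds the minimal expected total cost from time 0 and `policy[t]` the optimal action per state at each stage.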

4.
We characterize the value function and the optimal stopping time for a large class of optimal stopping problems where the underlying process to be stopped is a fairly general Markov process. The main result is inspired by recent findings for Lévy processes obtained essentially via the Wiener–Hopf factorization. The main ingredient in our approach is the representation of the β-excessive functions as expected suprema. A variety of examples is given.

5.
In this paper we consider stopping problems for continuous-time Markov chains under a general risk-sensitive optimization criterion for problems with finite and infinite time horizon. More precisely our aim is to maximize the certainty equivalent of the stopping reward minus cost over the time horizon. We derive optimality equations for the value functions and prove the existence of optimal stopping times. The exponential utility is treated as a special case. In contrast to risk-neutral stopping problems it may be optimal to stop between jumps of the Markov chain. We briefly discuss the influence of the risk sensitivity on the optimal stopping time and consider a special house selling problem as an example.

6.
Decision makers often need a performance guarantee that holds with some sufficiently high probability. Such problems can be modelled using a discrete-time Markov decision process (MDP) with a probability criterion for first achieving a target value. The objective is to find a policy that maximizes the probability of the total discounted reward exceeding a target value in the preceding stages. We show that our formulation cannot be described by earlier models with standard criteria. We provide the properties of the objective functions, optimal value functions and optimal policies. An algorithm for computing the optimal policies for the finite-horizon case is given. In this stochastic stopping model, we prove that there exists an optimal deterministic and stationary policy and that the optimality equation has a unique solution. Using perturbation analysis, we approximate general models and prove the existence of ε-optimal policies for finite state spaces. We give an example on the reliability of a satellite system.
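The finite-horizon computation for such a probability criterion can be illustrated by a toy dynamic program over the pair (remaining steps, remaining target). Everything in this sketch is invented for illustration (one underlying state, undiscounted integer rewards, hypothetical "safe"/"risky" actions); the paper's model is far more general:

```python
# Toy probability-criterion DP: maximize P(total reward over t steps >= target).
# The action set and reward distributions are hypothetical; integer rewards
# keep the target space finite so plain memoized recursion suffices.
from functools import lru_cache

ACTIONS = {
    "safe":  [(1, 1.0)],           # reward 1 with certainty
    "risky": [(3, 0.5), (0, 0.5)], # reward 3 or 0, equally likely
}

@lru_cache(maxsize=None)
def success_prob(t, target):
    """Maximal probability of accumulating at least `target` within t steps."""
    if target <= 0:
        return 1.0                 # target already reached
    if t == 0:
        return 0.0                 # out of time
    return max(
        sum(p * success_prob(t - 1, target - r) for r, p in outcomes)
        for outcomes in ACTIONS.values()
    )
```

Because the maximum in the recursion is always attained by a single action per (t, target) pair, the optimizer is a deterministic policy, in line with the abstract's existence claim.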

7.
We consider optimal stopping of independent sequences. Assuming that the corresponding imbedded planar point processes converge to a Poisson process, we introduce some additional conditions which allow us to approximate the optimal stopping problem of the discrete-time sequence by the optimal stopping of the limiting Poisson process. The optimal stopping of the involved Poisson processes is reduced to a differential equation for the critical curve which can be solved in several examples. We apply this method to obtain approximations for the stopping of iid sequences in the domain of max-stable laws with observation costs and with discount factors.

8.
In this paper we consider the problem of optimal stopping and continuous control on some local parameters of a piecewise-deterministic Markov process (PDP). Optimality equations are obtained in terms of a set of variational inequalities as well as on the first jump time operator of the PDP. It is shown that if the final cost function is absolutely continuous along trajectories then so is the value function of the optimal stopping problem with continuous control. These results unify and generalize previous ones in the current literature.

9.
A finite collection of piecewise-deterministic processes is controlled in order to minimize the expected value of a performance functional with continuous operating cost and discrete switching control costs. The solution of the associated dynamic programming equation is obtained by an iterative approximation using optimal stopping time problems. This research was supported in part by NSF Grant No. DMS-8508651 and by a University of Tennessee Science Alliance Research Incentive Award.

10.
This paper is concerned with the adaptive control problem, over the infinite horizon, for partially observable Markov decision processes whose transition functions are parameterized by an unknown vector. We treat finite models and impose relatively mild assumptions on the transition function. Provided that a sequence of parameter estimates converging in probability to the true parameter value is available, we show that the certainty equivalence adaptive policy is optimal in the long-run average sense.

11.
We develop an approach for solving one-sided optimal stopping problems in discrete time for general underlying Markov processes on the real line. The main idea is to transform the problem into an auxiliary problem for the ladder height variables. If the original problem has a one-sided solution and the auxiliary problem has a monotone structure, the corresponding myopic stopping time is optimal for the original problem as well. This elementary line of argument directly leads to a characterization of the optimal boundary in the original problem: the optimal threshold is given by the threshold of the myopic stopping time in the auxiliary problem. We also supply a sufficient condition for our approach to work, and obtain solutions for many prominent examples in the literature, among others the Novikov–Shiryaev and Shepp–Shiryaev problems and the American put in option pricing, under general conditions. As a further application we show that for underlying random walks (and Lévy processes in continuous time), general monotone and log-concave reward functions g lead to one-sided stopping problems.

12.
We study two classes of stochastic control problems with semicontinuous cost: the Mayer problem and optimal stopping for controlled diffusions. The value functions are introduced via linear optimization problems on appropriate sets of probability measures. These sets of constraints are described deterministically with respect to the coefficient functions. Both the lower and upper semicontinuous cases are considered. The value function is shown to be a generalized viscosity solution of the associated HJB system, respectively, of some variational inequality. Dual formulations are given, as well as the relations between the primal and dual value functions. Under classical convexity assumptions, we prove the equivalence between the linearized Mayer problem and the standard weak control formulation. Counter-examples are given for the general framework.

13.
This paper presents a basic formula for performance-gradient estimation of semi-Markov decision processes (SMDPs) under the average-reward criterion. This formula follows directly from a sensitivity equation in perturbation analysis. With this formula, we develop three sample-path-based gradient estimation algorithms that use a single sample path. These algorithms naturally extend many gradient estimation algorithms for discrete-time Markov systems to continuous-time semi-Markov models. In particular, they require less storage than the existing algorithm in the literature.
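The sensitivity-equation idea can be illustrated on a toy discrete-time chain. This is only a hedged sketch: the two-state chain, its rewards, and its parameterization are hypothetical, and it omits the holding times that distinguish the paper's semi-Markov setting. It evaluates the perturbation-analysis formula dη/dθ = π (dP/dθ) g, where π is the stationary distribution and g the potential (Poisson-equation solution), and can be checked against the closed-form average reward η(θ) = 0.5/(0.5+θ) of this particular chain:

```python
# Perturbation-analysis sensitivity sketch for a hypothetical two-state
# discrete-time chain with P(θ) = [[1-θ, θ], [0.5, 0.5]] and rewards r = [1, 0].
# Average reward: η(θ) = π(θ)·r; gradient via  dη/dθ = π (dP/dθ) g.

def solve2(A, b):
    """Solve a 2x2 linear system A x = b by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - b[0] * A[1][0]) / det]

def average_reward_gradient(theta):
    P  = [[1 - theta, theta], [0.5, 0.5]]   # P(θ)
    dP = [[-1.0, 1.0], [0.0, 0.0]]          # dP/dθ (rows sum to 0)
    r  = [1.0, 0.0]                         # state rewards
    # Stationary distribution π(θ) (closed form for this chain): π P = π.
    pi1 = theta / (0.5 + theta)
    pi = [1 - pi1, pi1]
    # Potential g from (I - P + 1π) g = r  (fundamental-matrix form);
    # entry (i, j) of the matrix is δ_ij - P[i][j] + π[j].
    A = [[1 - P[0][0] + pi[0], -P[0][1] + pi[1]],
         [-P[1][0] + pi[0], 1 - P[1][1] + pi[1]]]
    g = solve2(A, r)
    # Sensitivity formula: dη/dθ = π (dP/dθ) g.
    return sum(pi[i] * sum(dP[i][j] * g[j] for j in range(2)) for i in range(2))
```

For this chain the result agrees with differentiating η(θ) = 0.5/(0.5+θ) directly, e.g. dη/dθ = -0.78125 at θ = 0.3.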

14.
We develop an eigenfunction expansion based value iteration algorithm to solve discrete time infinite horizon optimal stopping problems for a rich class of Markov processes that are important in applications. We provide convergence analysis for the value function and the exercise boundary, and derive easily computable error bounds for value iterations. As an application we develop a fast and accurate algorithm for pricing callable perpetual bonds under the CIR short rate model.
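The value-iteration backbone of such a method can be sketched without the eigenfunction machinery. The stand-in below uses plain value iteration on a finite grid; the model (a reflected symmetric random walk on {0,...,4} with linear payoff g(s) = s and discount 0.9) is invented for the sketch and is not the paper's CIR application:

```python
# Plain grid-based value iteration for a discrete-time optimal stopping
# problem  V(s) = max( g(s), beta * E[V(next state)] ).  Generic stand-in:
# the paper accelerates this with eigenfunction expansions; the toy model
# here (reflected random walk, payoff g(s) = s) is hypothetical.

def stopping_value_iteration(n_states=5, beta=0.9, n_iter=500):
    g = list(range(n_states))                      # payoff g(s) = s
    V = [float(x) for x in g]                      # start iteration from g
    for _ in range(n_iter):
        # Jacobi-style update: the comprehension reads the old V throughout.
        V = [max(g[s],
                 beta * (V[max(s - 1, 0)] + V[min(s + 1, n_states - 1)]) / 2)
             for s in range(n_states)]
    # Stopping region: states where stopping immediately is optimal, V = g.
    stop_region = [s for s in range(n_states) if abs(V[s] - g[s]) < 1e-8]
    return V, stop_region

V, stop = stopping_value_iteration()
```

Since the Bellman operator is a beta-contraction, the iterates converge geometrically and the stopping region is read off as the set where the value function equals the payoff, here the upper part of the grid.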

15.
Consider the optimal stopping problem of a one-dimensional diffusion with positive discount. Based on Dynkin's characterization of the value as the minimal excessive majorant of the reward, and considering its Riesz representation, we give an explicit equation to find the optimal stopping threshold for problems with one-sided stopping regions, and an explicit formula for the value function of the problem. This representation also sheds light on the validity of the smooth-fit (SF) principle. The results are illustrated by solving some classical problems, as well as the optimal stopping of the skew Brownian motion and of the sticky Brownian motion, including cases in which the SF principle fails.

16.
In this paper we demonstrate how to develop analytic closed-form solutions to optimal multiple stopping time problems arising in settings in which the value function acts on a compound process that is modified by the actions taken at the stopping times. This class of problems is particularly relevant in insurance and risk management, and we demonstrate it on an important application domain: insurance strategies in Operational Risk management for financial institutions. In this area of risk management the most prevalent class of loss process models is the Loss Distribution Approach (LDA) framework, which models annual losses via a compound process. Given an LDA model framework, we consider Operational Risk insurance products that mitigate the risk for such loss processes and may reduce capital requirements. In particular, we consider insurance products that grant the policy holder the right to insure k of its annual Operational Risk losses in a horizon of T years. We consider two insurance product structures and two general model settings: first, families of relevant LDA loss models for which we can obtain closed-form optimal stopping rules under each generic insurance mitigation structure; and second, classes of LDA models for which we can develop closed-form approximations of the optimal stopping rules. In particular, for losses following a compound Poisson process with jump sizes given by an Inverse-Gaussian distribution and two generic types of insurance mitigation, we derive analytic expressions for the loss process modified by the insurance application, as well as closed-form solutions for the optimal multiple stopping rules in discrete time (annually). When the combination of insurance mitigation and jump-size distribution does not lead to tractable stopping rules, we develop a principled class of closed-form approximations to the optimal decision rule. These approximations are based on a class of orthogonal Askey polynomial series basis expansion representations of the annual loss compound process distribution and of functions of this annual loss.

17.
We find the closed-form formula for the price of the perpetual American lookback spread option, whose payoff is the difference between the running maximum and minimum prices of a single asset. We solve an optimal stopping problem related to both the maximum and the minimum. We show that the spread option is equivalent to some fixed-strike options on some domains, find the exact form of the optimal stopping region, and obtain the solution of the resulting partial differential equations. The value function is not differentiable; however, we prove the verification theorem due to the monotonicity of the maximum and minimum processes.

18.
We consider a general continuous-time finite-horizon single-agent consumption and portfolio decision problem with subsistence consumption and value of bankruptcy. Our analysis allows for random market coefficients and general continuously differentiable concave utility functions. We study the time of bankruptcy as a problem of optimal stopping, and succeed in obtaining explicit formulas for the optimal consumption and wealth processes in terms of the optimal bankruptcy time. This paper extends the results of Karatzas, Lehoczky, and Shreve (Ref. 1) on the maximization of expected utility from consumption in a financial market with random coefficients by incorporating subsistence consumption and bankruptcy. It also addresses the random coefficients and finite-horizon version of the problem treated by Sethi, Taksar, and Presman (Ref. 2). The mathematical tools used in our analysis are optimal stopping, stochastic control, martingale theory, and Girsanov change of measure.

19.
We study an infinite horizon optimal stopping Markov problem which is either undiscounted (total reward) or with a general Markovian discount rate. Using ergodic properties of the underlying Markov process, we establish the feasibility of the stopping problem and prove the existence of optimal and ε-optimal stopping times. We show the continuity of the value function and its variational characterisation (in the viscosity sense) under different sets of assumptions satisfied by large classes of diffusion and jump–diffusion processes. In the case of a general discounted problem we relax a classical assumption that the discount rate is uniformly separated from zero.

20.
This paper addresses the problem of buying an asset at its expected globally minimal price. To that end, we model it as an optimal stopping problem with regime switching driven by a continuous-time Markov chain. We characterize the optimal stopping time by optimizing the value functions and writing them as solutions of a system of integral equations. Finally, we develop a stochastic recursive algorithm for numerical implementation.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  京ICP备09084417号