Similar Documents
Found 20 similar documents (search time: 296 ms)
1.
We consider an undiscounted semi-Markov decision process with a target set; our main concern is the problem of minimizing a threshold probability. We formulate the problem as an infinite horizon case with a recurrent class. We show that the optimal value function is the unique solution of an optimality equation and that there exists a stationary optimal policy. Several value iteration methods and a policy improvement method are also given for our model. Furthermore, we investigate the relationship between threshold probabilities and expectations of total rewards.
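The value iteration methods mentioned above can be illustrated with a minimal sketch for an ordinary finite discounted MDP; the two-state transition and reward data below are hypothetical toy inputs, not the paper's undiscounted semi-Markov model with a threshold-probability criterion.

```python
# Minimal value-iteration sketch for a finite discounted MDP.
# P[s][a][t] = transition probability, R[s][a] = one-step reward
# (hypothetical toy data; the paper's model is semi-Markov and undiscounted).

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    n_states = len(P)
    V = [0.0] * n_states
    while True:
        V_new = [max(R[s][a] + gamma * sum(p * V[t] for t, p in enumerate(P[s][a]))
                     for a in range(len(P[s])))
                 for s in range(n_states)]
        if max(abs(x - y) for x, y in zip(V, V_new)) < tol:
            return V_new
        V = V_new

# Two states, two actions each.
P = [[[0.8, 0.2], [0.2, 0.8]],
     [[0.5, 0.5], [0.1, 0.9]]]
R = [[1.0, 0.0], [0.0, 2.0]]
V = value_iteration(P, R)
```

At convergence `V` satisfies the Bellman optimality equation up to the tolerance, which mirrors the abstract's claim that the optimal value function is the unique solution of an optimality equation.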

2.

We consider optimal pricing for a two-station tandem queueing system with finite buffers, communication blocking, and price-sensitive customers whose arrivals form a homogeneous Poisson process. The service provider quotes prices to incoming customers using either a static or dynamic pricing scheme. There may also be a holding cost for each customer in the system. The objective is to maximize either the discounted profit over an infinite planning horizon or the long-run average profit of the provider. We show that there exists an optimal dynamic policy that exhibits a monotone structure, in which the quoted price is non-decreasing in the queue length at either station and is non-increasing if a customer moves from station 1 to 2, for both the discounted and long-run average problems under certain conditions on the holding costs. We then focus on the long-run average problem and show that the optimal static policy performs as well as the optimal dynamic policy when the buffer size at station 1 becomes large, there are no holding costs, and the arrival rate is either small or large. We learn from numerical results that for systems with small arrival rates and no holding cost, the optimal static policy produces a gain quite close to the optimal gain even when the buffer at station 1 is small. On the other hand, for systems with arrival rates that are not small, there are cases where the optimal dynamic policy performs much better than the optimal static policy.


3.
Stochastic scheduling problems are considered using discounted dynamic programming. Both maximizing pure rewards and minimizing linear holding costs are treated in one common Markov decision problem. A sufficient condition for the optimality of the myopic policy over finite and infinite horizons is given. For the infinite horizon case we show the optimality of an index policy and give a sufficient condition for the index policy to be myopic. Moreover, the relation between the two sufficient conditions is discussed.

4.
5.
We provide weak sufficient conditions for a full-service policy to be optimal in a queueing control problem in which the service rate is a dynamic decision variable. In our model there are service costs and holding costs and the objective is to minimize the expected total discounted cost over an infinite horizon. We begin with a semi-Markov decision model for a single-server queue with exponentially distributed inter-arrival and service times. Then we present a general model with weak probabilistic assumptions and demonstrate that the full-service policy minimizes both finite-horizon and infinite-horizon total discounted cost on each sample path.

6.
In an M/M/N+M queue, when there are many customers waiting, it may be preferable to reject a new arrival rather than risk that arrival later abandoning without receiving service. On the other hand, rejecting new arrivals increases the percentage of time servers are idle, which also may not be desirable. We address these trade-offs by considering an admission control problem for an M/M/N+M queue when there are costs associated with customer abandonment, server idleness, and turning away customers. First, we formulate the relevant Markov decision process (MDP), show that the optimal policy is of threshold form, and provide a simple and efficient iterative algorithm that does not presuppose a bounded state space to compute the minimum infinite-horizon expected average cost and the associated threshold level. Under certain conditions we can guarantee that the algorithm provides an exact optimal solution when it stops; otherwise, the algorithm stops when a provided bound on the optimality gap is reached. Next, we solve the approximating diffusion control problem (DCP) that arises in the Halfin–Whitt many-server limit regime. This allows us to establish that the parameter space has a sharp division. Specifically, there is an optimal solution with a finite threshold level when the cost of an abandonment exceeds the cost of rejecting a customer; otherwise, there is an optimal solution that exercises no control. This analysis also yields a convenient analytic expression for the infinite-horizon expected average cost as a function of the threshold level. Finally, we propose a policy for the original system that is based on the DCP solution, and show that this policy is asymptotically optimal. Our extensive numerical study shows that the control that arises from solving the DCP achieves a cost very close to that of the control that arises from solving the MDP, even when the number of servers is small.
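The long-run average cost of a given admission threshold can be evaluated through the stationary distribution of the birth-death chain it induces. The sketch below does a brute-force search over a finite range of thresholds; all rates (`lam`, `mu`, `theta`) and cost coefficients are illustrative assumptions, and this is not the paper's iterative algorithm (which avoids presupposing a bounded state space).

```python
# Long-run average cost of a threshold-K admission policy for an M/M/N+M
# queue, via the stationary distribution of the induced birth-death chain.
# All rates and cost coefficients are illustrative assumptions.

def avg_cost(lam, mu, theta, N, K, c_abandon, c_idle, c_reject):
    # Unnormalized stationary probabilities pi_n for n = 0..K
    # (arrivals admitted only while fewer than K customers are present).
    pi = [1.0]
    for n in range(1, K + 1):
        death = min(n, N) * mu + max(n - N, 0) * theta
        pi.append(pi[-1] * lam / death)
    Z = sum(pi)
    pi = [p / Z for p in pi]
    abandon_rate = sum(max(n - N, 0) * theta * p for n, p in enumerate(pi))
    idle_servers = sum((N - min(n, N)) * p for n, p in enumerate(pi))
    reject_rate = lam * pi[K]  # arrivals turned away at the threshold
    return c_abandon * abandon_rate + c_idle * idle_servers + c_reject * reject_rate

# Exhaustive search over a finite range of thresholds.
costs = {K: avg_cost(lam=4.0, mu=1.0, theta=0.5, N=5, K=K,
                     c_abandon=10.0, c_idle=1.0, c_reject=3.0)
         for K in range(5, 30)}
best_K = min(costs, key=costs.get)
```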

7.
We determine replenishment and sales decisions jointly for an inventory system with random demand, lost sales and random yield. Demands in consecutive periods are independent random variables and their distributions are known. We incorporate discretionary sales, where inventory may be set aside to satisfy future demand even if some present demand may be lost. Our objective is to minimize the total discounted cost over the problem horizon by choosing an optimal replenishment and discretionary sales policy. We obtain the structure of the optimal replenishment and discretionary sales policy and show that the optimal policy for the finite horizon problem converges to that of the infinite horizon problem. Moreover, we compare the optimal policy under random yield with that under certain yield, and show that the optimal order quantity (sales quantity) under random yield is more (less) than that under certain yield.

8.
Finite and infinite planning horizon Markov decision problems are formulated for a class of jump processes with general state and action spaces and controls which are measurable functions on the time axis taking values in an appropriate metrizable vector space. For the finite horizon problem, the maximum expected reward exists, is the unique solution of a certain differential equation, and is a strongly continuous function in the space of upper semi-continuous functions. A necessary and sufficient condition is provided for an admissible control to be optimal, and a sufficient condition is provided for the existence of a measurable optimal policy. For the infinite horizon problem, the maximum expected total reward is the fixed point of a certain operator on the space of upper semi-continuous functions. A stationary policy is optimal over all measurable policies in the transient and discounted cases as well as, under certain added conditions, in the positive and negative cases.

9.
We present in this paper several asymptotic properties of constrained Markov Decision Processes (MDPs) with a countable state space. We treat both the discounted and the expected average cost, with unbounded cost. We are interested in (1) the convergence of finite horizon MDPs to the infinite horizon MDP, (2) the convergence of MDPs with a truncated state space to the problem with an infinite state space, and (3) the convergence of MDPs as the discount factor goes to a limit. In all these cases we establish the convergence of optimal values and policies. Moreover, based on the optimal policy for the limiting problem, we construct policies which are almost optimal for the other (approximating) problems. Based on the convergence of MDPs with a truncated state space to the problem with an infinite state space, we show that an optimal stationary policy exists such that the number of randomisations it uses is less than or equal to the number of constraints plus one. We finally apply the results to a dynamic scheduling problem. This work was partially supported by the Chateaubriand fellowship from the French embassy in Israel and by the European Grant BRA-QMIPS of CEC DG XIII.

10.
Planning horizon is a key issue in production planning. Departing from previous approaches based on Markov Decision Processes, we study the planning horizon of capacity planning problems within the framework of stochastic programming. We first consider an infinite horizon stochastic capacity planning model involving a single resource, a linear cost structure, and discrete distributions for general stochastic cost and demand data (non-Markovian and non-stationary). We give sufficient conditions for the existence of an optimal solution. Furthermore, we study the monotonicity property of the finite horizon approximation of the original problem. We show that the optimal objective value and solution of the finite horizon approximation problem converge to those of the infinite horizon problem as the time horizon goes to infinity. These convergence results, together with the integrality of decision variables, imply the existence of a planning horizon. We also develop a useful formula to calculate an upper bound on the planning horizon. Then, by decomposition, we show the existence of a planning horizon for a class of very general stochastic capacity planning problems which have a complicated decision structure.

11.
We consider the timing of replacement of obsolete subsystems within an extensive, complex infrastructure. Such replacement action, known as capital renewal, must balance uncertainty about future profitability against uncertainty about future renewal costs. Treating renewal investments as real options, we derive an optimal solution to the infinite horizon version of this problem and determine the total present value of an institution’s capital renewal options. We investigate the sensitivity of the infinite horizon solution to variations in key problem parameters and highlight the system scenarios in which timely renewal activity is most profitable. For finite horizon renewal planning, we show that our solution performs better than a policy of constant periodic renewals if more than two renewal cycles are completed.

12.
In this paper, infinite horizon Markovian decision programming with recursive reward functions is discussed. We show that Bellman's optimality principle is applicable to our model. Then, a necessary and sufficient condition for a policy to be optimal is given. For the stationary case, an iteration algorithm for finding a stationary optimal policy is designed. The algorithm is a generalization of Howard's [7] and Iwamoto's [3] algorithms. This research was supported by the National Natural Science Foundation of China.
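For reference, Howard-style policy iteration for an ordinary finite discounted MDP can be sketched as below; the paper's algorithm generalizes this scheme to recursive reward functions, and the two-state data here are hypothetical.

```python
# Howard-style policy iteration for a finite discounted MDP (the paper
# generalizes this scheme to recursive reward functions; the two-state
# data below are hypothetical toy inputs).

def policy_iteration(P, R, gamma=0.9, eval_sweeps=1000):
    n = len(P)
    policy = [0] * n
    while True:
        # Policy evaluation by successive approximation.
        V = [0.0] * n
        for _ in range(eval_sweeps):
            V = [R[s][policy[s]] + gamma * sum(p * V[t]
                     for t, p in enumerate(P[s][policy[s]]))
                 for s in range(n)]
        # Policy improvement step.
        new_policy = [max(range(len(P[s])),
                          key=lambda a: R[s][a] + gamma * sum(
                              p * V[t] for t, p in enumerate(P[s][a])))
                      for s in range(n)]
        if new_policy == policy:        # stable policy => optimal
            return policy, V
        policy = new_policy

P = [[[0.8, 0.2], [0.2, 0.8]],
     [[0.5, 0.5], [0.1, 0.9]]]
R = [[1.0, 0.0], [0.0, 2.0]]
policy, V = policy_iteration(P, R)
```

The algorithm alternates evaluation and improvement until the policy is stable, which is the structure the iteration algorithms of Howard [7] and Iwamoto [3] share.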

13.
In this paper, we consider a single-product, periodic-review, stochastic demand inventory problem where backorders are allowed and penalized via fixed and proportional backorder costs simultaneously. The fixed backorder cost associates a one-shot penalty with stockout situations, whereas the proportional backorder cost corresponds to a penalty for each demanded but not yet satisfied item. We discuss the optimality of a myopic base-stock policy for the infinite horizon case. The critical number of the infinite horizon myopic policy, i.e., the base-stock level, is denoted by S. If the initial inventory is below S, then the optimal policy is myopic in general, i.e., regardless of the values of the model parameters and the demand density. Otherwise, the sufficient condition for a myopic optimum requires some restrictions on the demand density or parameter values. However, this sufficient condition is not very restrictive, in the sense that it holds immediately for the Erlang demand density family. We also show that the value of S can be computed easily in the Erlang demand case. This special case is important since most real-life demand densities with a coefficient of variation not exceeding unity can be well represented by an Erlang density. Thus, the myopic policy may be considered an approximate solution if the exact policy is intractable. Finally, we comment on a generalization of this study to the case of phase-type demands, and identify some related research problems which utilize the results presented here.
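For Erlang demand, a base-stock level is easy to compute numerically. The sketch below uses a newsvendor-style critical fractile b/(b+h) as an illustrative stand-in for the paper's exact optimality condition (which also involves the fixed backorder cost); the shape, rate, and cost values are assumptions.

```python
# Base-stock level S for Erlang(k, lam) per-period demand, solving
# F(S) = b/(b+h) by bisection. The fractile criterion and all parameter
# values are illustrative assumptions, not the paper's exact condition.
import math

def erlang_cdf(x, k, lam):
    # P(D <= x) for Erlang demand with shape k and rate lam.
    return 1.0 - sum(math.exp(-lam * x) * (lam * x) ** n / math.factorial(n)
                     for n in range(k))

def base_stock(k, lam, b, h, hi=1000.0, tol=1e-6):
    target = b / (b + h)          # critical fractile
    lo = 0.0
    while hi - lo > tol:          # bisection: the CDF is increasing in x
        mid = 0.5 * (lo + hi)
        if erlang_cdf(mid, k, lam) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

S = base_stock(k=3, lam=0.5, b=9.0, h=1.0)   # mean demand k/lam = 6
```

Because the Erlang CDF has the closed form above, no special numerical libraries are needed, which is in the spirit of the abstract's claim that S is easy to compute in the Erlang case.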

14.
In this paper we consider a nonstationary periodic review dynamic production–inventory model with uncertain production capacity and uncertain demand. The maximum production capacity varies stochastically. It is known that order up-to (or base-stock, critical number) policies are optimal for both finite horizon problems and infinite horizon problems. We obtain upper and lower bounds of the optimal order up-to levels, and show that for an infinite horizon problem the upper and the lower bounds of the optimal order up-to levels for the finite horizon counterparts converge as the planning horizons considered get longer. Furthermore, under mild conditions the differences between the upper and the lower bounds converge exponentially to zero.

15.
In this paper we study the exploitation of a single-species forest plantation when the timber price is governed by a stochastic process. The work focuses on providing closed expressions for the optimal harvesting policy in terms of the parameters of the price process and the discount factor, with finite and infinite time horizons. We assume that harvesting is restricted to mature trees older than a certain age and that growth and natural mortality after maturity are neglected. We use stochastic dynamic programming techniques to characterize the optimal policy, and we model the price using a geometric Brownian motion and an Ornstein–Uhlenbeck process. In the first case we completely characterize the optimal policy for all possible choices of the parameters. In the second case we provide sufficient conditions, based on explicit expressions for reservation prices, assuring that harvesting everything available is optimal. In addition, for the Ornstein–Uhlenbeck case we propose a policy based on a reservation price that performs well in numerical simulations. In both cases we solve the problem for every initial condition, and the best policy is obtained endogenously, that is, without imposing any ad hoc restrictions such as maximum sustained yield or convergence to a predefined final state.
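A reservation-price rule of the kind evaluated in the paper's simulations can be assessed by Monte Carlo. The sketch below uses a geometric Brownian motion price with a discrete Euler step; the parameters, the reservation price, and the time discretization are all illustrative assumptions, not the paper's closed-form solution.

```python
# Monte Carlo value of a reservation-price harvesting rule under a
# geometric Brownian motion timber price. All parameters below are
# illustrative assumptions; the paper derives the policy analytically.
import math
import random

def discounted_revenue(p0, mu, sigma, r, p_res, T, dt=0.01, n_paths=500, seed=0):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        p, t, payoff = p0, 0.0, 0.0
        while t < T:
            if p >= p_res:                    # harvest everything at first hit
                payoff = math.exp(-r * t) * p
                break
            z = rng.gauss(0.0, 1.0)           # one Euler step of the GBM
            p *= math.exp((mu - 0.5 * sigma ** 2) * dt
                          + sigma * math.sqrt(dt) * z)
            t += dt
        total += payoff                       # zero if never harvested by T
    return total / n_paths

v = discounted_revenue(p0=1.0, mu=0.02, sigma=0.2, r=0.05, p_res=1.2, T=10.0)
```

Sweeping `p_res` over a grid and keeping the best value gives a crude simulation-based check of a reservation-price policy against alternatives.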

16.
We study risk-sensitive control of continuous-time Markov chains taking values in a discrete state space. We study both finite and infinite horizon problems. In the finite horizon problem we characterize the value function via the Hamilton–Jacobi–Bellman equation and obtain an optimal Markov control. We do the same for the infinite horizon discounted cost case. In the infinite horizon average cost case we establish the existence of an optimal stationary control under a certain Lyapunov condition. We also develop a policy iteration algorithm for finding an optimal control.

17.
We establish conditions under which a sequence of finite horizon convex programs monotonically increases in value to the value of the infinite program; a subsequence of optimal solutions converges to the optimal solution of the infinite problem. If the conditions we impose fail, then (roughly) the optimal value of the infinite horizon problem is an improper convex function. Under more restrictive conditions we establish the necessary and sufficient conditions for optimality. This constructive procedure gives us a way to solve the infinite (long-range) problem by solving a finite (short-range) problem. It appears to work well in practice.

18.
We address a rate control problem associated with a single server Markovian queueing system with customer abandonment in heavy traffic. The controller can choose a buffer size for the queueing system and also can dynamically control the service rate (equivalently the arrival rate) depending on the current state of the system. An infinite horizon cost minimization problem is considered here. The cost function includes a penalty for each rejected customer, a control cost related to the adjustment of the service rate and a penalty for each abandoning customer. We obtain an explicit optimal strategy for the limiting diffusion control problem (the Brownian control problem or BCP) which consists of a threshold-type optimal rejection process and a feedback-type optimal drift control. This solution is then used to construct an asymptotically optimal control policy, i.e. an optimal buffer size and an optimal service rate for the queueing system in heavy traffic. The properties of generalized regulator maps and weak convergence techniques are employed to prove the asymptotic optimality of this policy. In addition, we identify the parameter regimes where the infinite buffer size is optimal.

19.
In this paper we investigate an optimal job, consumption, and investment policy of an economic agent in a continuous and infinite time horizon. The agent’s preference is characterized by the Cobb–Douglas utility function whose arguments are consumption and leisure. We use the martingale method to obtain the closed-form solution for the optimal job, consumption, and portfolio policy. We compare the optimal consumption and investment policy with that in the absence of job choice opportunities.

20.
Optimization, 2012, 61(11): 2417–2440
We investigate necessary conditions of optimality for the Bolza-type infinite horizon problem with free right end. Optimality is understood in the sense of weakly uniformly overtaking optimal control. No prior knowledge of the asymptotic behaviour of trajectories or adjoint variables is necessary. Following Seierstad's idea, we obtain the necessary boundary condition at infinity in the form of a transversality condition for the maximum principle. These transversality conditions may be expressed in integral form through Aseev–Kryazhimskii-type formulae for co-state arcs. The connection between these formulae and the limiting gradients of the pay-off function at infinity is identified; several conditions under which it is possible to explicitly specify the co-state arc through those Aseev–Kryazhimskii-type formulae are found. For the infinite horizon problem of Bolza type, an example is given to clarify the use of the Aseev–Kryazhimskii formula as an explicit expression of the co-state arc.


Copyright©北京勤云科技发展有限公司  京ICP备09084417号