Similar documents
20 similar documents found (search time: 593 ms)
1.
《Optimization》2012,61(2):255-269
Constrained Markov decision processes with compact state and action spaces are studied under long-run average reward or cost criteria. By introducing a corresponding Lagrange function, a saddle-point theorem is given, by which the existence of a constrained optimal pair of initial state distribution and policy is shown. Also, under the hypothesis of Doeblin, a functional characterization of a constrained optimal policy is obtained.

2.
赵玲  刘志学 《运筹与管理》2022,31(6):105-110
To attract more customers, many e-commerce retailers allow customers to return products within a certain period, which significantly reduces their profits. Meanwhile, replenishment incurs not only a variable cost that depends on the replenishment quantity, but also a fixed cost that is independent of it. Motivated by this, a joint replenishment and pricing model accounting for customer returns and fixed costs is built, with the objective of maximizing the e-commerce retailer's profit, where the quantity of customer returns is proportional to the satisfied demand. For general demand, the optimal policy of the multi-period problem is partially characterized; for a special demand case, the (s, S, p) policy is proved optimal for the single-period problem, and the optimal policy of the multi-period problem is rigorously characterized. Based on these characterizations, heuristic policies are constructed for the multi-period problem. Numerical results show that the heuristic policies are near-optimal; when the initial inventory level is sufficiently high/low, the optimal replenishment level and price vary monotonically with the return rate and the fixed cost. Keywords: joint replenishment and pricing model; customer returns; fixed cost; stochastic dynamic programming; optimal policy
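As a rough illustration of the single-period trade-off described above (not the paper's model; the profit function, proportional-return assumption, and all parameter values below are hypothetical), a joint order-up-to/price decision can be evaluated by brute force:

```python
def expected_profit(y, p, x0, demand_pmf, alpha=0.2, c=1.0, K=2.0, h=0.1, b=0.5):
    """Expected one-period profit for order-up-to level y and posted price p.

    alpha: fraction of satisfied demand that is returned (refunded at price p)
    c, K : unit and fixed replenishment cost; h: holding cost; b: lost-sale penalty
    """
    order = max(y - x0, 0)
    profit = -(c * order + (K if order > 0 else 0.0))
    for d, prob in demand_pmf(p).items():
        sold = min(y, d)
        profit += prob * ((1 - alpha) * p * sold      # revenue net of returns
                          - h * max(y - d, 0)         # holding on leftover stock
                          - b * max(d - y, 0))        # penalty on unmet demand
    return profit

def best_policy(x0, demand_pmf, y_grid, p_grid):
    """Brute-force search for the profit-maximising (order-up-to, price) pair."""
    return max(((y, p) for y in y_grid for p in p_grid),
               key=lambda yp: expected_profit(yp[0], yp[1], x0, demand_pmf))
```

For the single-period case the abstract states an (s, S, p) structure is provably optimal; the search above merely enumerates candidate pairs on a grid.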

3.
We consider the problem of scheduling products with components on a single machine, where changeovers incur fixed costs. The objective is to minimize the weighted sum of total flow time and changeover cost. We provide properties of optimal solutions and develop an explicit characterization of optimal sequences, while showing that this characterization has recurrent properties. Our structural results have interesting implications for practitioners, primarily that the structure of optimal sequences is robust to changes in demand.

4.
Stochastic control problems for controlled Markov process models with an infinite planning horizon are considered, under some non-standard cost criteria. The classical discounted and average cost criteria can be viewed as complementary, in the sense that the former captures the short-time and the latter the long-time performance of the system. Thus, we study a cost criterion obtained as a weighted combination of these criteria, extending to a general state and control space framework several recent results by Feinberg and Shwartz, and by Krass et al. In addition, a functional characterization is given for overtaking optimal policies, for problems with countable state spaces and compact control spaces; our approach is based on qualitative properties of the optimality equation for problems with an average cost criterion. Research partially supported by the Engineering Foundation under grant RI-A-93-10, in part by the National Science Foundation under grant NSF-INT 9201430, and in part by a grant from the AT&T Foundation. Research partially supported by the Air Force Office of Scientific Research under Grant F49620-92-J-0045, and in part by the National Science Foundation under Grant CDR-8803012.

5.
This paper deals with discrete-time Markov control processes with Borel state and control spaces, with possibly unbounded costs and noncompact control constraint sets, and the average cost criterion. Conditions are given for the convergence of the value iteration algorithm to the optimal average cost, and for a sequence of finite-horizon optimal policies to have an accumulation point which is average cost optimal. This research was partially supported by the Consejo Nacional de Ciencia y Tecnología (CONACyT) under grant 1332-E9206.
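The value iteration idea for the average cost criterion can be sketched on a finite toy MDP (the paper's setting is Borel spaces with unbounded costs; the relative value iteration variant and the example chain below are illustrative assumptions only):

```python
def relative_value_iteration(P, c, n_iter=500):
    """Relative value iteration for a finite average-cost MDP.

    P[a][i][j]: transition probability i -> j under action a
    c[a][i]   : one-stage cost of action a in state i
    Returns an estimate of the optimal average cost g and relative values h.
    """
    n = len(c[0])
    h = [0.0] * n
    g = 0.0
    for _ in range(n_iter):
        Th = [min(c[a][i] + sum(P[a][i][j] * h[j] for j in range(n))
                  for a in range(len(c)))
              for i in range(n)]
        g = Th[0]                      # gain estimate: h[0] is pinned to 0
        h = [Th[i] - g for i in range(n)]
    return g, h
```

The normalization against a fixed reference state keeps the iterates bounded, which is the standard device for making value iteration converge under the average cost criterion.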

6.
We study a stochastic control problem for pure jump processes on a general state space with risk-sensitive discounted and ergodic cost criteria. For the discounted cost criterion we prove the existence and Hamilton–Jacobi–Bellman characterization of an optimal α-discounted control for bounded cost functions. For the ergodic cost criterion we assume a Lyapunov-type stability assumption and a small-cost condition. Under these assumptions we show the existence of the optimal risk-sensitive ergodic control.

7.
One of the most fundamental results in inventory theory is the optimality of the (s, S) policy for inventory systems with setup cost. This result is established based on a key assumption of infinite production/ordering capacity. Several studies have shown that, when there is a finite production/ordering capacity, the optimal policy for the inventory system is very complicated and indeed, only partial characterization of the optimal policy is possible. In this paper, we consider a continuous review inventory system with finite production/ordering capacity and setup cost, and show that the optimal control policy for this system has a very simple structure. We also develop efficient algorithms to compute the optimal control parameters.
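A discrete-time, discounted sketch conveys the flavour of computing such a policy numerically (the paper studies a continuous-review system; the state discretization, cost parameters, and demand distribution below are all hypothetical):

```python
def solve_capacitated(Smax, cap, K, c, h, b, demand_pmf, beta=0.9, n_iter=300):
    """Value iteration for a discrete capacitated inventory problem with
    setup cost K. States are inventory levels 0..Smax; each period's order
    quantity q is capped at `cap`. Returns the optimal order per state.
    """
    states = range(Smax + 1)
    V = [0.0] * (Smax + 1)
    policy = [0] * (Smax + 1)
    for _ in range(n_iter):
        newV, policy = [], []
        for x in states:
            best, best_q = float("inf"), 0
            for q in range(0, min(cap, Smax - x) + 1):
                cost = (K if q > 0 else 0.0) + c * q       # setup + variable cost
                for d, prob in demand_pmf.items():
                    left = max(x + q - d, 0)
                    cost += prob * (h * left               # holding
                                    + b * max(d - x - q, 0)  # shortage penalty
                                    + beta * V[left])      # discounted future
                if cost < best:
                    best, best_q = cost, q
            newV.append(best)
            policy.append(best_q)
        V = newV
    return policy
```

Inspecting the resulting `policy` on small instances is one way to visualise the "simple structure" the abstract refers to: order nothing above a threshold, and order as much as capacity and the threshold allow below it.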

8.
This paper studies discrete-time nonlinear controlled stochastic systems, modeled by controlled Markov chains (CMC) with denumerable state space and compact action space, and with an infinite planning horizon. Recently, there has been a renewed interest in CMC with a long-run, expected average cost (AC) optimality criterion. A classical approach to study average optimality consists in formulating the AC case as a limit of the discounted cost (DC) case, as the discount factor increases to 1, i.e., as the discounting effect vanishes. This approach has been rekindled in recent years, with the introduction by Sennott and others of conditions under which AC optimal stationary policies are shown to exist. However, AC optimality is a rather underselective criterion, which completely neglects the finite-time evolution of the controlled process. Our main interest in this paper is to study the relation between the notions of AC optimality and strong average cost (SAC) optimality. The latter criterion is introduced to assess the performance of a policy over long but finite horizons, as well as in the long-run average sense. We show that for bounded one-stage cost functions, Sennott's conditions are sufficient to guarantee that every AC optimal policy is also SAC optimal. On the other hand, a detailed counterexample is given that shows that the latter result does not extend to the case of unbounded cost functions. In this counterexample, Sennott's conditions are verified and a policy is exhibited that is both average and Blackwell optimal and satisfies the average cost inequality.

9.
This paper is concerned with the optimal production planning in a dynamic stochastic manufacturing system consisting of a single machine that is failure prone and facing a constant demand. The objective is to choose the rate of production over time in order to minimize the long-run average cost of production and surplus. The analysis proceeds with a study of the corresponding problem with a discounted cost. It is shown using the vanishing discount approach that the Hamilton–Jacobi–Bellman equation for the average cost problem has a solution giving rise to the minimal average cost and the so-called potential function. The result helps in establishing a verification theorem. Finally, the optimal control policy is specified in terms of the potential function.

10.
We study optimal control of Markov processes with age-dependent transition rates. The control policy is chosen continuously over time based on the state of the process and its age. We study infinite horizon discounted cost and infinite horizon average cost problems. Our approach is via the construction of an equivalent semi-Markov decision process. We characterise the value function and optimal controls for both discounted and average cost cases.

11.
We study strong average optimality for Markov decision processes (MDPs) with a denumerable state space, arbitrary action spaces, and non-uniformly bounded costs. Conditions are given under which every commonly used average optimal policy is also strongly average optimal, substantially extending the main results of Cavazos-Cadena and Fernández-Gaucherand (Math. Meth. Oper. Res., 1996, 43: 281-300).

12.
We consider a two-stage adaptive linear optimization problem under right-hand-side uncertainty with a min–max objective and give a sharp characterization of the power and limitations of affine policies (where the second-stage solution is an affine function of the right-hand-side uncertainty). In particular, we show that the worst-case cost of an optimal affine policy can be Ω(m^(1/2−δ)) times the worst-case cost of an optimal fully-adaptable solution for any δ > 0, where m is the number of linear constraints. We also show that the worst-case cost of the best affine policy is O(√m) times the optimal cost when the first-stage constraint matrix has non-negative coefficients. Moreover, if there are only k ≤ m uncertain parameters, we generalize the performance bound for affine policies to O(√k), which is particularly useful if only a few parameters are uncertain. We also provide an O(√m)-approximation algorithm for the general case without any restriction on the constraint matrix, although the solution is not an affine function of the uncertain parameters. We also give a tight characterization of the conditions under which an affine policy is optimal for the above model. In particular, we show that if the uncertainty set is a simplex, then an affine policy is optimal. However, an affine policy is suboptimal even if the uncertainty set is a convex combination of only (m + 3) extreme points (only two more extreme points than a simplex), and the worst-case cost of an optimal affine policy can be a factor (2 − δ) worse than the worst-case cost of an optimal fully-adaptable solution for any δ > 0.

13.
This paper considers a periodic-review shuttle service system with random customer demands and finite reposition capacity. The objective is to find the optimal stationary policy of empty container repositioning by minimizing the sum of container leasing cost, inventory cost and reposition cost. Using a Markov decision process approach, the structures of the optimal stationary policies for both the expected discounted cost and the long-run average cost are completely characterized. Monotonic and asymptotic behaviours of the optimal policy are established. By taking advantage of the special structure of the optimal policy, the stationary distribution of the system states is obtained, which is then used to compute interesting steady-state performance measures and implement the optimal policy. Numerical examples are given to demonstrate the results.

14.
We obtain a linear programming characterization for the minimum cost associated with finite dimensional reflected optimal control problems. In order to describe the value functions, we employ an infinite dimensional dual formulation instead of using the characterization via Hamilton-Jacobi partial differential equations. In this paper we consider control problems with both infinite and finite horizons. The reflection is given by the normal cone to a proximal retract set.

15.
We consider the optimization of finite-state, finite-action Markov decision processes under constraints. Costs and constraints are of the discounted or average type, and possibly finite-horizon. We investigate the sensitivity of the optimal cost and optimal policy to changes in various parameters. We relate several optimization problems to a generic linear program, through which we investigate sensitivity issues. We establish conditions for the continuity of the optimal value in the discount factor. In particular, the optimal value and optimal policy for the expected average cost are obtained as limits of the discounted case, as the discount factor goes to one. This generalizes a well-known result for the unconstrained case. We also establish the continuity in the discount factor for certain non-stationary policies. We then discuss the sensitivity of optimal policies and optimal values to small changes in the transition matrix and in the instantaneous cost functions. The importance of the last two results is related to the performance of adaptive policies for constrained MDP under various cost criteria [3,5]. Finally, we establish the convergence of the optimal value for the discounted constrained finite horizon problem to the optimal value of the corresponding infinite horizon problem.
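The limit result for the discount factor can be seen numerically on a toy two-state chain under a fixed policy (an illustration of the vanishing-discount phenomenon only, not the constrained LP machinery of the paper; the chain and costs are invented):

```python
def discounted_cost(P, c, beta, n_iter=20000):
    """Iteratively evaluate the discounted cost V of a fixed Markov chain
    with transition matrix P and one-stage costs c."""
    n = len(c)
    V = [0.0] * n
    for _ in range(n_iter):
        V = [c[i] + beta * sum(P[i][j] * V[j] for j in range(n))
             for i in range(n)]
    return V

# Symmetric two-state chain: stationary distribution (1/2, 1/2),
# so the long-run average cost is g = (1 + 3) / 2 = 2.
P = [[0.5, 0.5], [0.5, 0.5]]
c = [1.0, 3.0]
for beta in (0.9, 0.99, 0.999):
    V = discounted_cost(P, c, beta)
    # (1 - beta) * V(s) approaches the average cost g as beta -> 1
    print(beta, (1 - beta) * V[0])
```

For this chain one can check by hand that (1 − β)·V(0) = 1 + β, which indeed tends to the average cost 2 as β → 1.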

16.
This paper considers an optimal maintenance policy for a practical and repairable deteriorating system subject to random shocks. Modeling the repair time by a geometric process and the failure mechanism by a generalized δ-shock process, we develop an explicit expression of the long-term average cost per time unit for the system under a threshold-type replacement policy. Based on this average cost function, we propose a finite search algorithm to locate the optimal replacement policy N that minimizes the average cost rate. We further prove that the optimal policy N is unique and present some numerical examples. Many practical systems fit the model developed in this paper.
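A finite search of this kind can be sketched with the classical renewal-reward formula for geometric-process repair models (the exact cost expression in the paper differs since it involves a generalized δ-shock process; the formula and every parameter value below are illustrative assumptions):

```python
def average_cost_rate(N, lam=10.0, a=1.1, mu=1.0, b=1.2, cr=2.0, R=50.0, r=5.0):
    """Long-run average cost rate of the policy 'replace at the N-th failure'
    in a geometric-process repair model: successive mean up-times lam/a**(k-1)
    shrink, successive mean repair times mu*b**(k-1) grow; cr = repair cost
    rate, R = replacement cost, r = operating reward rate (renewal-reward)."""
    up = sum(lam / a ** (k - 1) for k in range(1, N + 1))    # expected up-time
    down = sum(mu * b ** (k - 1) for k in range(1, N))       # expected repair time
    return (cr * down + R - r * up) / (up + down)

def optimal_N(N_max=60, **kwargs):
    """Finite search for the replacement threshold minimising the cost rate."""
    return min(range(1, N_max + 1),
               key=lambda N: average_cost_rate(N, **kwargs))
```

Because up-times shrink and repair times grow, the cost rate is high for very small and very large N, so the finite search finds an interior minimizer.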

17.
This paper studies the policy iteration algorithm (PIA) for average cost Markov control processes on Borel spaces. Two classes of MCPs are considered. One of them allows some restricted-growth unbounded cost functions and compact control constraint sets; the other one requires strictly unbounded costs and the control constraint sets may be non-compact. For each of these classes, the PIA yields, under suitable assumptions, the optimal (minimum) cost, an optimal stationary control policy, and a solution to the average cost optimality equation.
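A finite-state sketch shows the shape of the PIA (the paper works on Borel spaces with unbounded costs; this toy machine-maintenance MDP and all its numbers are hypothetical):

```python
def policy_iteration(P, c, n_eval=500, n_improve=50):
    """Policy iteration for a finite, unichain average-cost MDP.

    P[a][i][j]: transition probability i -> j under action a
    c[a][i]   : one-stage cost of action a in state i
    The evaluation step runs relative value iteration under the fixed policy.
    """
    n, n_actions = len(c[0]), len(c)
    pol = [0] * n
    g = 0.0
    for _ in range(n_improve):
        h = [0.0] * n                  # evaluate: gain g and bias h, h[0] = 0
        for _ in range(n_eval):
            Th = [c[pol[i]][i] + sum(P[pol[i]][i][j] * h[j] for j in range(n))
                  for i in range(n)]
            g = Th[0]
            h = [Th[i] - g for i in range(n)]
        new_pol = [min(range(n_actions),
                       key=lambda a: c[a][i] + sum(P[a][i][j] * h[j]
                                                   for j in range(n)))
                   for i in range(n)]
        if new_pol == pol:             # improvement reached a fixed point
            break
        pol = new_pol
    return pol, g
```

On a three-state machine ('good', 'worn', 'failed') with actions 'run' and 'repair', a few improvement steps suffice to reach the stationary policy solving the average cost optimality equation.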

18.
19.
An optimal replacement policy for a multistate degenerative simple system (total citations: 1; self-citations: 0; citations by others: 1)
In this paper, a degenerative simple system (i.e. a degenerative one-component system with one repairman) with k + 1 states, including k failure states and one working state, is studied. Assume that the system after repair is not "as good as new", and that the degeneration of the system is stochastic. Under these assumptions, we consider a new replacement policy T based on the system age. Our problem is to determine an optimal replacement policy T such that the average cost rate (i.e. the long-run average cost per unit time) of the system is minimized. The explicit expression of the average cost rate is derived; the corresponding optimal replacement policy can be determined; the explicit expression of the minimum of the average cost rate can be found; and, under some mild conditions, the existence and uniqueness of the optimal policy T can be proved. Further, we show that the repair model for the multistate system in this paper forms a general monotone process repair model which includes the geometric process repair model as a special case. We also show that the repair model in the paper is equivalent to a geometric process repair model for a two-state degenerative simple system in the sense that they have the same average cost rate and the same optimal policy. Finally, a numerical example is given to illustrate the theoretical results of this model.

20.
In this paper we study optimal stopping and impulse control with a long-run average cost functional for ergodic Markov processes whose transition semigroup P_t converges uniformly on compact sets to a unique invariant measure as t → ∞. We restrict the class of strategies to so-called stopping rules and obtain continuity of value functions and a characterization of optimal rules.


Copyright©北京勤云科技发展有限公司  京ICP备09084417号