In this paper, we study average optimality for continuous-time controlled jump Markov processes in general state and action spaces. The criterion to be minimized is the expected average cost. Both the transition rates and the cost rates are allowed to be unbounded. We propose another set of conditions under which we first establish an average optimality inequality by using the well-known “vanishing discounting factor approach”. Then, when the cost (or reward) rates are nonnegative (or nonpositive), we use the average optimality inequality, the Dynkin formula, and the Tauberian theorem to prove the existence of an average optimal stationary policy within the class of all randomized history-dependent policies. Finally, when the cost (or reward) rates have neither upper nor lower bounds, we also prove the existence of an average optimal policy within the class of all (deterministic) stationary policies by constructing a “new” cost (or reward) rate. Research partially supported by the Natural Science Foundation of China (Grant No: 10626021) and the Natural Science Foundation of Guangdong Province (Grant No: 06300957).

7.

Convergence of the optimal values of constrained Markov control processes

Jorge Alvarez-Mena Onésimo Hernández-Lerma 《Mathematical Methods of Operations Research》2002,55(3):461-484

8.

Admission control for a multi-server queue with abandonment

Yaşar Levent Koçağa Amy R. Ward 《Queueing Systems》2010,65(3):275-323

In an M/M/N+M queue, when there are many customers waiting, it may be preferable to reject a new arrival rather than risk that arrival later abandoning without receiving service. On the other hand, rejecting new arrivals increases the percentage of time servers are idle, which also may not be desirable. We address these trade-offs by considering an admission control problem for an M/M/N+M queue when there are costs associated with customer abandonment, server idleness, and turning away customers. First, we formulate the relevant Markov decision process (MDP), show that the optimal policy is of threshold form, and provide a simple and efficient iterative algorithm that does not presuppose a bounded state space to compute the minimum infinite horizon expected average cost and associated threshold level. Under certain conditions we can guarantee that the algorithm provides an exact optimal solution when it stops; otherwise, the algorithm stops when a provided bound on the optimality gap is reached. Next, we solve the approximating diffusion control problem (DCP) that arises in the Halfin–Whitt many-server limit regime. This allows us to establish that the parameter space has a sharp division. Specifically, there is an optimal solution with a finite threshold level when the cost of an abandonment exceeds the cost of rejecting a customer; otherwise, there is an optimal solution that exercises no control. This analysis also yields a convenient analytic expression for the infinite horizon expected average cost as a function of the threshold level. Finally, we propose a policy for the original system that is based on the DCP solution, and show that this policy is asymptotically optimal. Our extensive numerical study shows that the control that arises from solving the DCP achieves a very similar cost to the control that arises from solving the MDP, even when the number of servers is small.
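To make the threshold structure concrete, here is a minimal sketch that evaluates the long-run average cost of a fixed threshold policy in an M/M/N+M queue. It is not the authors' iterative algorithm: it assumes arrivals are admitted only while fewer than `threshold` customers are in the system, so the queue becomes a finite birth-death chain whose stationary distribution follows from the balance equations. All function and parameter names (`lam`, `mu`, `theta`, `c_abandon`, `c_idle`, `c_reject`, `max_threshold`) are hypothetical, and the brute-force search over a bounded range of thresholds is an illustrative shortcut rather than the paper's method.

```python
def stationary_distribution(lam, mu, theta, n_servers, threshold):
    """Stationary distribution of the number in system on states 0..threshold.

    Assumed model: Poisson arrivals at rate lam (admitted only below the
    threshold), exponential service at rate mu per busy server, and
    exponential abandonment at rate theta per waiting customer.
    """
    weights = [1.0]
    for n in range(1, threshold + 1):
        # Total departure rate from state n: service completions plus abandonments.
        death = mu * min(n, n_servers) + theta * max(n - n_servers, 0)
        weights.append(weights[-1] * lam / death)
    total = sum(weights)
    return [w / total for w in weights]


def average_cost(lam, mu, theta, n_servers, threshold,
                 c_abandon, c_idle, c_reject):
    """Long-run average cost per unit time under a fixed threshold policy."""
    pi = stationary_distribution(lam, mu, theta, n_servers, threshold)
    cost = 0.0
    for n, p in enumerate(pi):
        waiting = max(n - n_servers, 0)
        idle = max(n_servers - n, 0)
        # Abandonments occur at rate theta * waiting; idleness is charged per idle server.
        cost += p * (c_abandon * theta * waiting + c_idle * idle)
    # Arrivals are rejected only when the system is at the threshold.
    cost += c_reject * lam * pi[threshold]
    return cost


def best_threshold(lam, mu, theta, n_servers,
                   c_abandon, c_idle, c_reject, max_threshold):
    """Brute-force search over thresholds 0..max_threshold (a hypothetical cap)."""
    return min(
        range(max_threshold + 1),
        key=lambda k: average_cost(lam, mu, theta, n_servers, k,
                                   c_abandon, c_idle, c_reject),
    )
```

For example, `average_cost(1.0, 1.0, 0.5, 1, 1, 3.0, 1.0, 2.0)` describes a single server with no waiting room: the server is idle half the time and half of all arrivals are rejected, so the idleness and rejection terms are the only contributions to the cost.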

9.

Dynamic productivity improvement in a model with multiple processes

Michael Brock Jørgen Tind 《Mathematical Methods of Operations Research》2001,54(3):387-393

10.

Optimality of randomized strategies in a Markovian replacement model

Peter Bruns 《Mathematical Methods of Operations Research》2003,56(3):481-499

11.

Sharp-interface limit of the Allen-Cahn action functional in one space dimension

Robert V. Kohn Maria G. Reznikoff Yoshihiro Tonegawa 《Calculus of Variations and Partial Differential Equations》2006,25(4):503-534

We analyze the sharp-interface limit of the action minimization problem for the stochastically perturbed Allen-Cahn equation in one space dimension. The action is a deterministic functional which is linked to the behavior of the stochastic process in the small noise limit. Previously, heuristic arguments and numerical results have suggested that the limiting action should “count” two competing costs: the cost to nucleate interfaces and the cost to propagate them. In addition, constructions have been used to derive an upper bound for the minimal action which was proved optimal on the level of scaling. In this paper, we prove that for d = 1, the upper bound achieved by the constructions is in fact sharp. Furthermore, we derive a lower bound for the functional itself, which is in agreement with the heuristic picture. To do so, we characterize the sharp-interface limit of the space-time energy measures. The proof relies on an extension of earlier results for the related elliptic problem. Mathematics Subject Classification (2000) 49J45, 35R60, 60F10

12.

Adaptive control of constrained Markov chains: Criteria and policies

Eitan Altman Adam Shwartz 《Annals of Operations Research》1991,28(1):101-134

We consider the constrained optimization of a finite-state, finite-action Markov chain. In the adaptive problem, the transition probabilities are assumed to be unknown, and no prior distribution on their values is given. We consider constrained optimization problems in terms of several cost criteria which are asymptotic in nature. For these criteria we show that it is possible to achieve the same optimal cost as in the non-adaptive case. We first formulate a constrained optimization problem under each of the cost criteria and establish the existence of optimal stationary policies. Since the adaptive problem is inherently non-stationary, we suggest a class of Asymptotically Stationary (AS) policies, and show that, under each of the cost criteria, the costs of an AS policy depend only on its limiting behavior. This property implies that there exist optimal AS policies. A method for generating adaptive policies is then suggested, which leads to strongly consistent estimators for the unknown transition probabilities. A way to guarantee that these policies are also optimal is to couple them with the adaptive algorithm of [3]. This leads to optimal policies for each of the adaptive constrained optimization problems under discussion. This work was supported in part through United States-Israel Binational Science Foundation Grant BSF 85-00306.

13.

Optimal switching problem for countable Markov chains: average reward criterion

Alexander Yushkevich 《Mathematical Methods of Operations Research》2001,53(1):1-24

14.

Control policy of a hysteretic queueing system

Tadj Lotfi Ke Jau-Chuan 《Mathematical Methods of Operations Research》2003,57(3):367-376

15.

Weighted Markov decision processes with perturbation

Ke Liu Jerzy A. Filar 《Mathematical Methods of Operations Research》2001,53(3):465-480

16.

Reward functionals, salvage values, and optimal stopping (total citations: 2; self-citations: 0; citations by others: 2)

Luis H. R. Alvarez 《Mathematical Methods of Operations Research》2001,54(2):315-337

17.

Portfolio optimization under transaction costs in the CRR model

Jörn Sass 《Mathematical Methods of Operations Research》2005,61(2):239-259

18.

Conditions for the uniqueness of optimal policies of discounted Markov decision processes

Daniel Cruz-Suárez Raúl Montes-de-Oca Francisco Salem-Silva 《Mathematical Methods of Operations Research》2004,60(3):415-436

19.

Dynamic order replenishment policy in internet-based supply chains

Oded Berman Eungab Kim 《Mathematical Methods of Operations Research》2001,53(3):371-390

20.

Multigrid algorithms for a vertex-centered covolume method for elliptic problems

So-Hsiang Chou Do Y. Kwak 《Numerische Mathematik》2002,90(3):441-458

Summary. We analyze V-cycle multigrid algorithms for a class of perturbed problems whose perturbation in the bilinear form preserves the convergence properties of the multigrid algorithm of the original problem. As an application, we study the convergence of multigrid algorithms for a covolume method or a vertex-centered finite volume element method for variable coefficient elliptic problems on polygonal domains. As in standard finite element methods, the V-cycle algorithm with one pre-smoothing converges with a rate independent of the number of levels. Various types of smoothers, including point or line Jacobi and Gauss-Seidel relaxation, are considered. Received August 19, 1999 / Revised version received July 10, 2000 / Published online June 7, 2001