1.

A counterexample on overtaking optimality

Andrzej S. Nowak Oscar Vega-Amaya 《Mathematical Methods of Operations Research》1999,49(3):435-439

2.

Blackwell Optimality in the Class of Markov Policies for Continuous-Time Controlled Markov Chains

Tomás Prieto-Rumeau 《Acta Appl Math》2006,92(1):77-96

This paper deals with Blackwell optimality for continuous-time controlled Markov chains with compact Borel action space, and possibly unbounded reward (or cost) rates and unbounded transition rates. We prove the existence of a deterministic stationary policy which is Blackwell optimal in the class of all admissible (nonstationary) Markov policies, thus extending previous results that analyzed Blackwell optimality in the class of stationary policies. We compare our assumptions to the corresponding ones for discrete-time Markov controlled processes.

3.

Notes on equivalent stationary policies in Markov decision processes with total rewards

Eugene A. Feinberg Isaac M. Sonin 《Mathematical Methods of Operations Research》1996,44(2):205-221

We construct examples of Markov Decision Processes for which, for a given initial state and for a given nonstationary transient policy, there is no equivalent (randomized) stationary policy, i.e. there is no stationary policy whose occupation measure is equal to the occupation measure of the given policy. We also investigate the relation between the existence of equivalent stationary policies in special models and the existence of equivalent strategies in various classes of nonstationary policies in general models.

4.

Nonstationary Markov decision processes with average cost

Wei Liren (魏力仁) 《经济数学》1995,(1)

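This paper studies nonstationary Markov decision processes with average cost on a general state space. The result for the stationary case, in which the average-cost optimality equation is established from the optimality equations of supplementary discounted models, is extended to the nonstationary case. This result is then used to prove the existence of an optimal policy.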

5.

Policy iteration for robust nonstationary Markov decision processes

Saumya Sinha Archis Ghate 《Optimization Letters》2016,10(8):1613-1628

Policy iteration is a well-studied algorithm for solving stationary Markov decision processes (MDPs). It has also been extended to robust stationary MDPs. For robust nonstationary MDPs, however, an “as is” execution of this algorithm is not possible because it would call for an infinite amount of computation in each iteration. We therefore present a policy iteration algorithm for robust nonstationary MDPs, which performs finitely implementable approximate variants of policy evaluation and policy improvement in each iteration. We prove that the sequence of cost-to-go functions produced by this algorithm monotonically converges pointwise to the optimal cost-to-go function; the policies generated converge subsequentially to an optimal policy.

6.

Deviation Matrix, Laurent Series and Blackwell Optimality in Countable State Markov Decision Processes

《Optimization》2012,61(1):191-202

This paper presents a recurrent condition on Markov decision processes with a countable state space and bounded rewards. The condition is sufficient for the existence of a Blackwell optimal stationary policy, having the Laurent series expansion with continuous coefficients. It is so relaxed that the Markov chain corresponding to a stationary policy may have countably many periodic recurrent classes. Our method finds the deviation matrix in an explicit form.

7.

Optimal Pricing and Admission Control in a Queueing System with Periodically Varying Parameters

Seunghwan Yoon Mark E. Lewis 《Queueing Systems》2004,47(3):177-199

We consider congestion control in a nonstationary queueing system. Assuming that the arrival and service rates are bounded, periodic functions of time, a Markov decision process (MDP) formulation is developed. We show that, under the infinite horizon discounted and average reward optimality criteria, for each fixed time, optimal pricing and admission control strategies are nondecreasing in the number of customers in the system. This extends stationary results to the nonstationary setting. Despite this result, the problem still seems intractable. We propose an easily implementable pointwise stationary approximation (PSA) to approximate the optimal policies, suggest a heuristic to improve the implementation of the PSA and verify its usefulness via a numerical study.

8.

A new strong optimality criterion for nonstationary Markov decision processes

Xianping Guo Peng Shi Weiping Zhu 《Mathematical Methods of Operations Research》2000,52(2):287-306

This paper deals with a new optimality criterion consisting of the usual three average criteria and the canonical triplet (together called the strong average-canonical optimality criterion) and introduces the concept of a strong average-canonical policy for nonstationary Markov decision processes, which is an extension of the canonical policies of Hernández-Lerma and Lasserre [16] (p. 77) for stationary Markov controlled processes. For the case of possibly non-uniformly bounded rewards and denumerable state space, we first construct, under some conditions, a solution to the optimality equations (OEs), and then prove that the Markov policies obtained from the OEs are not only optimal for the three average criteria but also optimal for all finite horizon criteria with a sequence of additional functions as their terminal rewards (i.e. strong average-canonical optimal). Also, some properties of optimal policies and optimal average value convergence are discussed. Moreover, the error bound in average reward between a rolling horizon policy and a strong average-canonical optimal policy is provided, and then a rolling horizon algorithm for computing strong average ε(>0)-optimal Markov policies is given.

9.

Markov decision processes with a stopping time constraint

Masayuki Horiguchi 《Mathematical Methods of Operations Research》2001,53(2):279-295

10.

Graph-theoretic and algebraic characterizations of some Markov processes

A. Paz 《Israel Journal of Mathematics》1963,1(3):169-180

An algebraic decidable condition for a stationary Markov chain to consist of a single ergodic set, and a graph-theoretic decidable condition for a stationary Markov chain to consist of a single ergodic noncyclic set are formulated. In the third part of the paper a graph-theoretic condition for a nonstationary Markov chain to have the weakly-ergodic property is given. The paper is based on part of the author’s work towards the D. Sc. degree.

11.

Stochastic processes indexed by hypergroups. I

R. Lasser M. Leitner 《Journal of Theoretical Probability》1989,2(3):301-311

For many applications it is desirable to have extensions of the classical theory of weakly stationary processes to certain classes of nonstationary ones. A large family for which Fourier methods still play a major role is the harmonizable class and some related processes. The purpose of this paper is to initiate the study of a class of stochastic processes with a stationarity condition based on the notion of hypergroups.

12.

Average cost Markov decision processes under the hypothesis of Doeblin

Masami Kurano 《Annals of Operations Research》1991,29(1):375-385

Average cost Markov decision processes (MDPs) with compact state and action spaces and bounded lower semicontinuous cost functions are considered. Kurano [7] has treated the general case in which several ergodic classes and a transient set are permitted for the Markov process induced by any randomized stationary policy under the hypothesis of Doeblin and showed the existence of a minimum pair of state and policy. This paper considers the same case as that discussed in Kurano [7] and proves some new results which give the existence theorem of an optimal stationary policy under some reasonable conditions.

13.

Constrained Markov decision processes with total cost criteria: Occupation measures and primal LP

Eitan Altman 《Mathematical Methods of Operations Research》1996,43(1):45-72

This paper is the third in a series on constrained Markov decision processes (CMDPs) with a countable state space and unbounded cost. In the previous papers we studied the expected average and the discounted cost. We analyze in this paper the total cost criterion. We study the properties of the set of occupation measures achieved by different classes of policies; we then focus on stationary policies and on mixed deterministic policies and present conditions under which optimal policies exist within these classes. We conclude by introducing an equivalent infinite Linear Program.

14.

Optimal preventive maintenance of a production system with an intermediate buffer

《European Journal of Operational Research》2006,168(1):86-99

In this paper we consider a model consisting of a deteriorating installation that transfers a raw material to a production unit and a buffer which has been built between the installation and the production unit. The deterioration process of the installation is considered to be nonstationary, i.e. the transition probabilities may depend not only on the working conditions of the installation but on its age as well. The problem of the optimal preventive maintenance of the installation is considered. Under a suitable cost structure it is shown that, for fixed age of the installation and fixed buffer level, the optimal policy is of control-limit type. When the deterioration process is stationary, an efficient Markov decision algorithm operating on the class of control-limit policies is developed. There is strong numerical evidence that the algorithm converges to the optimal policy. Two generalizations of this model are also discussed.

15.

MARKOVIAN DECISION PROGRAMMING WITH RECURSIVE VECTOR-REWARD

Liu Jianyong Liu Ke 《Acta Mathematicae Applicatae Sinica (English Series)》1990,6(2):158-165

In this paper, we discuss Markovian decision programming with recursive vector-reward and give an algorithm to find optimal policies. We prove that: (1) There is a Markovian optimal policy for the nonstationary case; (2) There is a stationary optimal policy for the stationary case.

16.

On reduction of some classes of partial differential equations to equations with fewer variables and exact solutions

Yu. V. Zasorin 《Siberian Mathematical Journal》2006,47(4):653-658

We establish a connection between the fundamental solutions to some classes of linear nonstationary partial differential equations and the fundamental solutions to other nonstationary equations with fewer variables. In particular, reduction enables us to obtain exact formulas for the fundamental solutions of some spatial nonstationary equations of mathematical physics (for example, the Kadomtsev-Petviashvili equation, the Kelvin-Voigt equation, etc.) from the available fundamental solutions to one-dimensional stationary equations.

17.

Combinatorial systems defined over one- and two-letter alphabets

Charles E. Hughes W. E. Singletary 《Archive for Mathematical Logic》1975,17(1-2):25-33

In this paper we shall investigate some families of decision problems associated with a number of combinatorial systems whose alphabets are restricted to one and two letters. Our purpose is to try to gain a better understanding of the boundaries between classes of combinatorial systems whose decision problems are solvable and those whose decision problems are of any prescribed r.e. many-one degree of unsolvability. We show that, for one-letter alphabets, the decision problems considered here for semi-Thue systems, Thue systems, Post normal systems, Markov algorithms and Post correspondence classes are each solvable. In contrast, for two-letter alphabets, each such family of decision problems represents every r.e. many-one degree of unsolvability.

18.

Value iteration and approximately optimal stationary policies in finite-state average Markov decision chains

Rolando Cavazos-Cadena 《Mathematical Methods of Operations Research》2002,56(2):181-196

19.

Blackwell optimality in the class of all policies in Markov decision chains with a Borel state space and unbounded rewards

Arie Hordijk Alexander A. Yushkevich 《Mathematical Methods of Operations Research》1999,50(3):421-448

20.

Open queueing networks in discrete time-some limit theorems

Vinod Sharma 《Queueing Systems》1993,14(1-2):159-175

A finite number of nodes, each with a single server and infinite buffers, is considered in discrete time. The service may be FIFO and the service times are constant. The external arrivals and the routing decision variables form a general stationary sequence. Stability of the system is proved under these assumptions. Extension to multiple servers at a node and general stationary distributions holds. If the external input is i.i.d. and the routing is Markovian then stochastic ordering, continuity of stationary distributions, rates of convergence, a functional CLT and a functional LIL and various other limit theorems for the queue length process are also proved. Generalizations to multiple servers at nodes, customers with priority, multiple customer classes, general service length and Markov modulated external arrival cases are discussed.