Similar literature
20 similar documents found (search time: 125 ms)
1.
2.
This paper deals with Blackwell optimality for continuous-time controlled Markov chains with compact Borel action space, and possibly unbounded reward (or cost) rates and unbounded transition rates. We prove the existence of a deterministic stationary policy which is Blackwell optimal in the class of all admissible (nonstationary) Markov policies, thus extending previous results that analyzed Blackwell optimality in the class of stationary policies. We compare our assumptions to the corresponding ones for discrete-time Markov controlled processes.

3.
We construct examples of Markov Decision Processes for which, for a given initial state and for a given nonstationary transient policy, there is no equivalent (randomized) stationary policy, i.e. there is no stationary policy whose occupation measure is equal to the occupation measure of the given policy. We also investigate the relation between the existence of equivalent stationary policies in special models and the existence of equivalent strategies in various classes of nonstationary policies in general models.

4.
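This paper studies nonstationary Markov decision processes with the average cost criterion on a general state space. In the stationary case, the average-cost optimality equation can be established from the optimality equations of auxiliary discounted models; we extend this result to the nonstationary case. Using this result, we prove the existence of an optimal policy.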

5.
Policy iteration is a well-studied algorithm for solving stationary Markov decision processes (MDPs). It has also been extended to robust stationary MDPs. For robust nonstationary MDPs, however, an “as is” execution of this algorithm is not possible because it would call for an infinite amount of computation in each iteration. We therefore present a policy iteration algorithm for robust nonstationary MDPs, which performs finitely implementable approximate variants of policy evaluation and policy improvement in each iteration. We prove that the sequence of cost-to-go functions produced by this algorithm monotonically converges pointwise to the optimal cost-to-go function; the policies generated converge subsequentially to an optimal policy.
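For orientation, the following is a minimal sketch of classical policy iteration for a finite, stationary, discounted-cost MDP, i.e. the baseline algorithm the abstract starts from. It is not the robust nonstationary algorithm of the paper. The function name `policy_iteration`, the data layout `P[a][s][t]` and `c[s][a]`, the discount factor and the two-state example are all assumptions made for illustration.

```python
def policy_iteration(P, c, gamma=0.9, tol=1e-10):
    """Classical policy iteration for a finite discounted-cost MDP.

    P[a][s][t] is the probability of moving from state s to state t
    under action a; c[s][a] is the one-step cost (hypothetical layout).
    Returns a deterministic stationary policy and its cost-to-go.
    """
    n_states, n_actions = len(c), len(c[0])

    def q(s, a, v):
        # One-step cost plus discounted expected cost-to-go.
        return c[s][a] + gamma * sum(P[a][s][t] * v[t] for t in range(n_states))

    policy = [0] * n_states
    while True:
        # Policy evaluation: iterate v <- c_pi + gamma * P_pi v to a fixed point.
        v = [0.0] * n_states
        while True:
            new_v = [q(s, policy[s], v) for s in range(n_states)]
            diff = max(abs(x - y) for x, y in zip(new_v, v))
            v = new_v
            if diff < tol:
                break
        # Policy improvement: switch only on a strict improvement,
        # so numerical ties cannot make the policy cycle.
        stable = True
        for s in range(n_states):
            best = min(range(n_actions), key=lambda a: q(s, a, v))
            if q(s, best, v) < q(s, policy[s], v) - 1e-8:
                policy[s] = best
                stable = False
        if stable:
            return policy, v


# Hypothetical example: action 0 stays in place, action 1 switches state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
c = [[1.0, 2.0], [0.0, 0.0]]   # c[s][a]
policy, values = policy_iteration(P, c)
```

In this example, staying in state 0 costs 1 per step forever, whereas paying 2 once to reach the cost-free state 1 is cheaper, so the sketch should select "switch" in state 0 and "stay" in state 1.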

6.
《Optimization》2012,61(1):191-202
This paper presents a recurrent condition on Markov decision processes with a countable state space and bounded rewards. The condition is sufficient for the existence of a Blackwell optimal stationary policy, having the Laurent series expansion with continuous coefficients. It is so relaxed that the Markov chain corresponding to a stationary policy may have countably many periodic recurrent classes. Our method finds the deviation matrix in an explicit form.

7.
Seunghwan Yoon, Mark E. Lewis 《Queueing Systems》2004,47(3):177-199
We consider congestion control in a nonstationary queueing system. Assuming that the arrival and service rates are bounded, periodic functions of time, a Markov decision process (MDP) formulation is developed. We show that, under the infinite horizon discounted and average reward optimality criteria, for each fixed time the optimal pricing and admission control strategies are nondecreasing in the number of customers in the system. This extends stationary results to the nonstationary setting. Despite this result, the problem still seems intractable. We propose an easily implementable pointwise stationary approximation (PSA) to approximate the optimal policies, suggest a heuristic to improve the implementation of the PSA and verify its usefulness via a numerical study.
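The pointwise stationary idea can be illustrated on a much simpler object than the optimal policies treated in the abstract: at each time, freeze the time-varying rates and plug them into a stationary formula. The sketch below does this for the mean number in system of an M/M/1 queue. The M/M/1 model, the sinusoidal `arrival_rate` with a 24-hour period, the constant service rate and the function names are all assumptions for illustration and are not taken from the paper.

```python
import math


def arrival_rate(t):
    # Hypothetical periodic arrival rate with a 24-hour period.
    return 0.5 + 0.3 * math.sin(2 * math.pi * t / 24)


def psa_mean_number_in_system(t, service_rate=1.0):
    """Pointwise stationary approximation for an M/M/1 queue.

    Uses the stationary formula L = rho / (1 - rho) with the load rho
    evaluated at the instantaneous rates at time t.
    """
    rho = arrival_rate(t) / service_rate
    if rho >= 1:
        raise ValueError("stationary formula requires rho < 1")
    return rho / (1 - rho)


# Approximate mean number in system at the start and at the peak of the cycle.
at_start = psa_mean_number_in_system(0.0)
at_peak = psa_mean_number_in_system(6.0)
```

With these assumed rates the load is 0.5 at time 0 and 0.8 at time 6, so the approximation gives about 1 and about 4 customers respectively. The approximation ignores transient effects, which is one reason the abstract pairs it with a heuristic and a numerical study.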

8.
This paper deals with a new optimality criterion consisting of the usual three average criteria and the canonical triplet (together called the strong average-canonical optimality criterion) and introduces the concept of a strong average-canonical policy for nonstationary Markov decision processes, which is an extension of the canonical policies of Hernández-Lerma and Lasserre [16] (p. 77) for stationary Markov controlled processes. For the case of possibly non-uniformly bounded rewards and denumerable state space, we first construct, under some conditions, a solution to the optimality equations (OEs), and then prove that the Markov policies obtained from the OEs are not only optimal for the three average criteria but also optimal for all finite horizon criteria with a sequence of additional functions as their terminal rewards (i.e. strong average-canonical optimal). Also, some properties of optimal policies and optimal average value convergence are discussed. Moreover, the error bound in average reward between a rolling horizon policy and a strong average-canonical optimal policy is provided, and then a rolling horizon algorithm for computing strong average ε(>0)-optimal Markov policies is given.

9.
10.
An algebraic decidable condition for a stationary Markov chain to consist of a single ergodic set, and a graph-theoretic decidable condition for a stationary Markov chain to consist of a single ergodic noncyclic set are formulated. In the third part of the paper a graph-theoretic condition for a nonstationary Markov chain to have the weakly-ergodic property is given. The paper is based on part of the author’s work towards the D. Sc. degree.

11.
For many applications it is desirable to have extensions of the classical theory of weakly stationary processes to certain classes of nonstationary ones. A large family for which Fourier methods still play a major role is the harmonizable class and some related processes. The purpose of this paper is to initiate the study of a class of stochastic processes with a stationarity condition based on the notion of hypergroups.

12.
Average cost Markov decision processes (MDPs) with compact state and action spaces and bounded lower semicontinuous cost functions are considered. Kurano [7] has treated the general case in which several ergodic classes and a transient set are permitted for the Markov process induced by any randomized stationary policy under the hypothesis of Doeblin and showed the existence of a minimum pair of state and policy. This paper considers the same case as that discussed in Kurano [7] and proves some new results which give the existence theorem of an optimal stationary policy under some reasonable conditions.

13.
This paper is the third in a series on constrained Markov decision processes (CMDPs) with a countable state space and unbounded cost. In the previous papers we studied the expected average and the discounted cost. We analyze in this paper the total cost criterion. We study the properties of the set of occupation measures achieved by different classes of policies; we then focus on stationary policies and on mixed deterministic policies and present conditions under which optimal policies exist within these classes. We conclude by introducing an equivalent infinite Linear Program.

14.
In this paper we consider a model consisting of a deteriorating installation that transfers a raw material to a production unit and a buffer which has been built between the installation and the production unit. The deterioration process of the installation is considered to be nonstationary, i.e. the transition probabilities may depend not only on the working conditions of the installation but on its age as well. The problem of the optimal preventive maintenance of the installation is considered. Under a suitable cost structure it is shown that, for fixed age of the installation and fixed buffer level, the optimal policy is of control-limit type. When the deterioration process is stationary, an efficient Markov decision algorithm operating on the class of control-limit policies is developed. There is strong numerical evidence that the algorithm converges to the optimal policy. Two generalizations of this model are also discussed.

15.
In this paper, we discuss Markovian decision programming with recursive vector-reward and give an algorithm to find optimal policies. We prove that: (1) there is a Markovian optimal policy for the nonstationary case; (2) there is a stationary optimal policy for the stationary case.

16.
We establish a connection between the fundamental solutions to some classes of linear nonstationary partial differential equations and the fundamental solutions to other nonstationary equations with fewer variables. In particular, reduction enables us to obtain exact formulas for the fundamental solutions of some spatial nonstationary equations of mathematical physics (for example, the Kadomtsev-Petviashvili equation, the Kelvin-Voigt equation, etc.) from the available fundamental solutions to one-dimensional stationary equations.

17.
In this paper we shall investigate some families of decision problems associated with a number of combinatorial systems whose alphabets are restricted to one and two letters. Our purpose is to try to gain a better understanding of the boundaries between classes of combinatorial systems whose decision problems are solvable and those whose decision problems are of any prescribed r.e. many-one degree of unsolvability. We show that, for one-letter alphabets, the decision problems considered here for semi-Thue systems, Thue systems, Post normal systems, Markov algorithms and Post correspondence classes are each solvable. In contrast, for two-letter alphabets, each such family of decision problems represents every r.e. many-one degree of unsolvability.

18.
19.
20.
Vinod Sharma 《Queueing Systems》1993,14(1-2):159-175
A finite number of nodes, each with a single server and infinite buffers, is considered in discrete time. The service may be FIFO and the service times are constant. The external arrivals and the routing decision variables form a general stationary sequence. Stability of the system is proved under these assumptions. Extension to multiple servers at a node and general stationary distributions holds. If the external input is i.i.d. and the routing is Markovian then stochastic ordering, continuity of stationary distributions, rates of convergence, a functional CLT and a functional LIL and various other limit theorems for the queue length process are also proved. Generalizations to multiple servers at nodes, customers with priority, multiple customer classes, general service length and Markov modulated external arrival cases are discussed.
