Similar literature
20 similar articles found
1.
This paper provides sufficient conditions under which certain information about the past of a stochastic decision process can be ignored by a controller. We illustrate the results with particular applications to queueing control, control of semi-Markov decision processes with iid sojourn times, and uniformization of continuous-time Markov decision processes. Mathematics Subject Classification (2000): Primary 60K25, Secondary 90C40
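As a rough illustration of the uniformization mentioned in this abstract, the sketch below converts the transition-rate matrix of a finite-state continuous-time chain (under one fixed action) into a discrete-time transition matrix. It is not taken from the paper: the finite state space, the function names `uniformize` and `discount_factor`, and the example rates are all assumptions made for illustration.

```python
def uniformize(Q, rate=None):
    """Return (P, rate), where P = I + Q / rate is a stochastic matrix.

    Q is a hypothetical generator matrix: off-diagonal entries are
    nonnegative transition rates and every row sums to zero.
    """
    n = len(Q)
    max_exit = max(-Q[i][i] for i in range(n))  # largest total exit rate
    if rate is None:
        rate = max_exit
    if rate <= 0 or rate < max_exit:
        raise ValueError("uniformization rate must be positive and at least the largest exit rate")
    P = [
        [(1.0 if i == j else 0.0) + Q[i][j] / rate for j in range(n)]
        for i in range(n)
    ]
    return P, rate


def discount_factor(alpha, rate):
    """Discrete-time discount factor matching continuous discount rate alpha."""
    return rate / (alpha + rate)


# Illustrative two-state example with made-up rates.
P, rate = uniformize([[-2.0, 2.0], [3.0, -3.0]])
```

Choosing the rate equal to the largest exit rate is the tightest valid choice; any larger value also works and simply adds more fictitious self-transitions.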

2.
The paper deals with continuous-time Markov decision processes on a fairly general state space. The economic criterion is the long-run average return. A set of conditions is shown to be sufficient for a constant g to be the optimal average return and for a stationary policy π1 to be optimal. These conditions are shown to be satisfied under appropriate assumptions on the optimal discounted return function. A policy improvement algorithm is proposed and its convergence to an optimal policy is proved.

3.
Finite and infinite planning horizon Markov decision problems are formulated for a class of jump processes with general state and action spaces and controls which are measurable functions on the time axis taking values in an appropriate metrizable vector space. For the finite horizon problem, the maximum expected reward is the unique solution (which exists) of a certain differential equation and is a strongly continuous function in the space of upper semi-continuous functions. A necessary and sufficient condition is provided for an admissible control to be optimal, and a sufficient condition is provided for the existence of a measurable optimal policy. For the infinite horizon problem, the maximum expected total reward is the fixed point of a certain operator on the space of upper semi-continuous functions. A stationary policy is optimal over all measurable policies in the transient and discounted cases as well as, with certain added conditions, in the positive and negative cases.

4.
5.
We prove a version of Pontryagin's maximum principle for time and norm optimal control of linear diffusion processes. This result includes both necessary and sufficient conditions and implies a "concentration principle" for the optimal measure-valued controls.

6.
This paper is devoted to studying continuous-time Markov decision processes with general state and action spaces, under the long-run expected average reward criterion. The transition rates of the underlying continuous-time Markov processes are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. We provide new sufficient conditions for the existence of average optimal policies. Moreover, such sufficient conditions are imposed on the controlled process’ primitive data and thus they are directly verifiable. Finally, we apply our results to two new examples.

7.
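Geometric properties of joint efficient solution classes in group multi-objective decision making   (Total citations: 2; self-citations: 0; citations by others: 2)
Group multi-objective decision making is a research area at the intersection of group decision making and multi-objective decision making. Using the efficiency numbers of the alternatives, reference [1] introduced the concept of joint efficient solution classes for group multi-objective decision problems and established K-T optimality conditions for these solution classes. This paper studies the geometric properties of these solutions and obtains several basic necessary conditions and sufficient conditions.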

8.
Structural properties of stochastic dynamic programs are essential for understanding the nature of the solutions and for deriving appropriate approximation techniques. We concentrate on a class of multidimensional Markov decision processes and derive sufficient conditions for the monotonicity of the value functions. We illustrate our result in the case of the multiproduct batch dispatch (MBD) problem.

9.
We consider time-discrete systems which are described by a system of difference equations. The related discrete optimal control problems are introduced. Additionally, a game-theoretic extension is derived, which leads to general multicriteria decision problems. The characterization of their optimal behavior is studied. Given starting and final states define the decision process; by applying dynamic programming techniques, suitable optimal solutions can be obtained. We generalize that approach to a special game-theoretic decision procedure on networks. We characterize Nash equilibria and present sufficient conditions for their existence. A constructive algorithm is derived. The sufficient conditions are exploited to get the algorithmic solution. Its complexity analysis is presented and at the end we conclude with an extension to the complementary case of Pareto optima. Dmitrii Lozovanu was supported by BGP CRDF-MRDA MOM2-3049-CS-03.

10.
In this paper we discuss the discrete time non-homogeneous discounted Markovian decision programming, where the state space and all action sets are countable. Suppose that the optimum value function is finite. We give the necessary and sufficient conditions for the existence of an optimal policy. Suppose that the absolute mean of rewards is relatively bounded. We also give the necessary and sufficient conditions for the existence of an optimal policy.

11.
This note describes sufficient conditions under which total-cost and average-cost Markov decision processes (MDPs) with general state and action spaces, and with weakly continuous transition probabilities, can be reduced to discounted MDPs. For undiscounted problems, these reductions imply the validity of optimality equations and the existence of stationary optimal policies. The reductions also provide methods for computing optimal policies. The results are applied to a capacitated inventory control problem with fixed costs and lost sales.

12.
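We study strong average optimality for Markov decision processes (MDPs) with a countable state space, an arbitrary action space, and non-uniformly bounded costs. We give conditions under which every commonly used average optimal policy is also strong average optimal, and we substantially extend the main results of Cavazos-Cadena and Fernandez-Gaucheran (Math. Meth. Oper. Res., 1996, 43: 281-300).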

13.
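Using comparison with the moments of hitting times of birth-death processes and the method of stochastic comparability, respectively, this paper obtains sufficient conditions and necessary conditions for the various types of ergodicity of finite-birth single-death processes. At the end of the paper, the various types of ergodicity of an example are discussed.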

14.
In this paper, the dynamical behavior of a hybrid switching SIS epidemic model with vaccination and Lévy jumps is considered. Besides a standard geometric Brownian motion, another two driving processes are taken into account: a stationary Poisson point process and a continuous time finite-state Markov chain. Firstly, we establish sufficient conditions for persistence in the mean of the disease. Then we obtain sufficient conditions for extinction of the disease. In addition, we also establish sufficient conditions for the existence of positive recurrence of the solutions to the model by constructing a suitable stochastic Lyapunov function with regime switching.

15.
This paper studies both the average sample-path reward (ASPR) criterion and the limiting average variance criterion for denumerable discrete-time Markov decision processes. The rewards may have neither upper nor lower bounds. We give sufficient conditions on the system’s primitive data under which we prove the existence of ASPR-optimal stationary policies and variance optimal policies. Our conditions are weaker than those in the previous literature. Moreover, our results are illustrated by a controlled queueing system. Research partially supported by the Natural Science Foundation of Guangdong Province (Grant No: 06025063) and the Natural Science Foundation of China (Grant No: 10626021).

16.
We provide weak sufficient conditions for a full-service policy to be optimal in a queueing control problem in which the service rate is a dynamic decision variable. In our model there are service costs and holding costs and the objective is to minimize the expected total discounted cost over an infinite horizon. We begin with a semi-Markov decision model for a single-server queue with exponentially distributed inter-arrival and service times. Then we present a general model with weak probabilistic assumptions and demonstrate that the full-service policy minimizes both finite-horizon and infinite-horizon total discounted cost on each sample path.

17.
Markovian decision processes are considered in the situation of discrete time, countable state space, and general decision space. By introducing a Banach space with a weighted supremum norm, conditions are derived which guarantee convergence of successive approximations to the value function. These conditions are weaker than those required by the usual supnorm approach. Several properties of the successive approximations are derived.
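To make the idea of successive approximations concrete, here is a minimal sketch for a much simpler setting than the one in this abstract: a finite state space, finitely many actions, and a discount factor `beta`, all of which are assumptions made for illustration. The weight vector `w` and the names `weighted_sup_norm` and `value_iteration` are likewise hypothetical; the weighted supremum norm is used here only to measure the distance between consecutive iterates.

```python
def weighted_sup_norm(v, w):
    """Weighted supremum norm: max over states of |v(s)| / w(s), with w(s) > 0."""
    return max(abs(x) / wi for x, wi in zip(v, w))


def value_iteration(P, r, beta, w=None, tol=1e-10, max_iter=10_000):
    """Successive approximations for a finite discounted MDP.

    P[a][s][t] is the probability of moving from state s to state t under
    action a, and r[a][s] is the one-step reward. Both layouts are
    illustrative assumptions, not taken from the abstract.
    """
    n_actions = len(r)
    n_states = len(r[0])
    if w is None:
        w = [1.0] * n_states  # unit weights give the usual supremum norm
    v = [0.0] * n_states
    for _ in range(max_iter):
        # One application of the optimality operator to the current iterate.
        v_new = [
            max(
                r[a][s] + beta * sum(P[a][s][t] * v[t] for t in range(n_states))
                for a in range(n_actions)
            )
            for s in range(n_states)
        ]
        diff = weighted_sup_norm([x - y for x, y in zip(v_new, v)], w)
        v = v_new
        if diff < tol:
            break
    return v
```

In this finite, discounted sketch the iterates converge for any positive weights; the weights matter in settings such as the one the abstract describes, where rewards may be unbounded in the ordinary supremum norm.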

18.
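Ergodicity and stationary distribution of generalized birth-death processes   (Total citations: 1; self-citations: 0; citations by others: 1)
Reference [1] studied upward integral-type stochastic functionals of generalized birth-death processes and obtained several numerical characteristics of these processes, together with a necessary and sufficient condition for recurrence. This paper discusses downward integral-type stochastic functionals of generalized birth-death processes, gives a necessary and sufficient condition for ergodicity and a formula for the mean return time, and derives the stationary distribution under the ergodicity condition.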

19.
Constructibility is a combinatorial property of simplicial complexes. In general, it requires a great deal of time to decide whether a simplicial complex is constructible or not. In this paper, we consider sufficient conditions for nonconstructibility of simplicial 3-balls to investigate efficient algorithms for the decision problem.

20.
A model of the food chain chemostat involving predator, prey and growth-limiting nutrients is considered. The model incorporates two discrete time delays in order to describe the time involved in converting processes. The Lotka–Volterra type increasing functions are used to describe the species uptakes. In addition to showing that solutions with positive initial conditions are positive and bounded, we establish sufficient conditions for the (i) local stability and instability of the positive equilibrium and (ii) global stability of the non-negative equilibria. Numerical simulation suggests that the delays have both destabilizing and stabilizing effects, and the system can produce stable periodic solutions, quasi-periodic solutions and strange attractors.
