1.

On essential information in sequential decision processes

Eugene A. Feinberg 《Mathematical Methods of Operations Research》2005,62(3):399-410

This paper provides sufficient conditions under which certain information about the past of a stochastic decision process can be ignored by a controller. We illustrate the results with particular applications to queueing control, control of semi-Markov decision processes with iid sojourn times, and uniformization of continuous-time Markov decision processes. Mathematics Subject Classification (2000): Primary 60K25, Secondary 90C40

2.

Continuous time control of Markov processes on an arbitrary state space: Average return criterion

Bharat T. Doshi 《Stochastic Processes and their Applications》1976,4(1):55-77

The paper deals with continuous-time Markov decision processes on a fairly general state space. The economic criterion is the long-run average return. A set of conditions is shown to be sufficient for a constant g to be the optimal average return and a stationary policy π¹ to be optimal. These conditions are shown to be satisfied under appropriate assumptions on the optimal discounted return function. A policy improvement algorithm is proposed and its convergence to an optimal policy is proved.

3.

Controlled jump processes

Stanley R. Pliska 《Stochastic Processes and their Applications》1975,3(3):259-282

Finite and infinite planning horizon Markov decision problems are formulated for a class of jump processes with general state and action spaces and controls which are measurable functions on the time axis taking values in an appropriate metrizable vector space. For the finite horizon problem, the maximum expected reward is the unique solution, which exists, of a certain differential equation and is a strongly continuous function in the space of upper semi-continuous functions. A necessary and sufficient condition is provided for an admissible control to be optimal, and a sufficient condition is provided for the existence of a measurable optimal policy. For the infinite horizon problem, the maximum expected total reward is the fixed point of a certain operator on the space of upper semi-continuous functions. A stationary policy is optimal over all measurable policies in the transient and discounted cases as well as, with certain added conditions, in the positive and negative cases.

4.

Preference inference with general additive value models and holistic pair-wise statements

Remy Spliet Tommi Tervonen 《European Journal of Operational Research》2014


5.

Optimal Control of Diffusions

Hector O. Fattorini 《Applied Mathematics and Optimization》2002,46(2):207-230

We prove a version of Pontryagin's maximum principle for time and norm optimal control of linear diffusion processes. This result includes both necessary and sufficient conditions and implies a "concentration principle" for the optimal measure-valued controls.

6.

New sufficient conditions for average optimality in continuous-time Markov decision processes

Liuer Ye Xianping Guo 《Mathematical Methods of Operations Research》2010,72(1):75-94

This paper is devoted to studying continuous-time Markov decision processes with general state and action spaces, under the long-run expected average reward criterion. The transition rates of the underlying continuous-time Markov processes are allowed to be unbounded, and the reward rates may have neither upper nor lower bounds. We provide new sufficient conditions for the existence of average optimal policies. Moreover, such sufficient conditions are imposed on the controlled process’ primitive data and thus they are directly verifiable. Finally, we apply our results to two new examples.

7.

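Geometric properties of joint efficient solution classes in group multiobjective decision making

Yuda Hu (胡毓达) 《Operations Research Transactions (运筹学学报)》2001,5(3):21-28

Group multiobjective decision making is a research area at the intersection of group decision making and multiobjective decision making. Using the efficiency numbers of the alternatives, reference [1] introduced the concept of joint efficient solution classes for group multiobjective decision problems and established K-T optimality conditions for these solution classes. This paper studies the geometric properties of such solutions and obtains several basic necessary conditions and sufficient conditions.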

8.

Monotonicity in multidimensional Markov decision processes for the batch dispatch problem

Katerina Papadaki Warren B. Powell 《Operations Research Letters》2007,35(2):267-272

Structural properties of stochastic dynamic programs are essential to understanding the nature of the solutions and in deriving appropriate approximation techniques. We concentrate on a class of multidimensional Markov decision processes and derive sufficient conditions for the monotonicity of the value functions. We illustrate our result in the case of the multiproduct batch dispatch (MBD) problem.

9.

An approach for an algorithmic solution of discrete optimal control problems and their game-theoretical extension

Dmitrii Lozovanu Stefan Pickl 《Central European Journal of Operations Research》2006,14(4):357-375

We consider time-discrete systems which are described by a system of difference equations. The related discrete optimal control problems are introduced. Additionally, a game-theoretic extension is derived, which leads to general multicriteria decision problems. The characterization of their optimal behavior is studied. Given starting and final states define the decision process; by applying dynamic programming techniques, suitable optimal solutions can be obtained. We generalize that approach to a special game-theoretic decision procedure on networks. We characterize Nash equilibria and present sufficient conditions for their existence. A constructive algorithm is derived. The sufficient conditions are exploited to get the algorithmic solution. Its complexity analysis is presented and at the end we conclude with an extension to the complementary case of Pareto optima. Dmitrii Lozovanu was supported by BGP CRDF-MRDA MOM2-3049-CS-03.

10.

Existence of optimal policy for time non-homogeneous discounted Markovian decision programming

Shizhen Guo Zeqing Dong 《Acta Mathematicae Applicatae Sinica, English Series (应用数学学报(英文版))》1990,6(4):295-307

In this paper we discuss discrete-time non-homogeneous discounted Markovian decision programming, where the state space and all action sets are countable. Assuming that the optimal value function is finite, we give necessary and sufficient conditions for the existence of an optimal policy. Assuming instead that the absolute mean of the rewards is relatively bounded, we again give necessary and sufficient conditions for the existence of an optimal policy.

11.

Reduction of total-cost and average-cost MDPs with weakly continuous transition probabilities to discounted MDPs

Eugene A. Feinberg Jefferson Huang 《Operations Research Letters》2018,46(2):179-184

This note describes sufficient conditions under which total-cost and average-cost Markov decision processes (MDPs) with general state and action spaces, and with weakly continuous transition probabilities, can be reduced to discounted MDPs. For undiscounted problems, these reductions imply the validity of optimality equations and the existence of stationary optimal policies. The reductions also provide methods for computing optimal policies. The results are applied to a capacitated inventory control problem with fixed costs and lost sales.

12.

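Strong average optimality conditions for MDPs with non-uniformly bounded costs

Qingchu Xiao Hangsheng Tan (肖晴初 谭杭生) 《Operations Research Transactions (运筹学学报)》2010,14(1):95-105

We study strong average optimality for Markov decision processes (MDPs) with a countable state space, an arbitrary action space, and non-uniformly bounded costs. We give conditions under which every average optimal policy in the usual sense is also strong average optimal, substantially extending the main results of Cavazos-Cadena and Fernandez-Gaucherand (Math. Meth. Oper. Res., 1996, 43: 281-300).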

13.

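Ergodicity of a class of single death processes

Lihua Zhang Yuhui Zhang (张丽华 张余辉) 《Chinese Journal of Applied Probability and Statistics (应用概率统计)》2007,23(4):377-383

By comparing moments of hitting times with those of birth-death processes and by using stochastic comparability, this paper obtains sufficient conditions and necessary conditions, respectively, for the various types of ergodicity of finite-birth single death processes. At the end of the paper, the various types of ergodicity are discussed for an example.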

14.

Dynamical behavior of a hybrid switching SIS epidemic model with vaccination and Lévy jumps

Qun Liu Tasawar Hayat Ahmed Alsaedi 《Stochastic Analysis and Applications》2019,37(3):388-411

In this paper, the dynamical behavior of a hybrid switching SIS epidemic model with vaccination and Lévy jumps is considered. Besides a standard geometric Brownian motion, two other driving processes are taken into account: a stationary Poisson point process and a continuous-time finite-state Markov chain. First, we establish sufficient conditions for persistence in the mean of the disease. Then we obtain sufficient conditions for extinction of the disease. In addition, we establish sufficient conditions for positive recurrence of the solutions to the model by constructing a suitable stochastic Lyapunov function with regime switching.

15.

Sample-path optimality and variance-maximization for Markov decision processes

Q. X. Zhu 《Mathematical Methods of Operations Research》2007,65(3):519-538

This paper studies both the average sample-path reward (ASPR) criterion and the limiting average variance criterion for denumerable discrete-time Markov decision processes. The rewards may have neither upper nor lower bounds. We give sufficient conditions on the system’s primitive data under which we prove the existence of ASPR-optimal stationary policies and variance optimal policies. Our conditions are weaker than those in the previous literature. Moreover, our results are illustrated by a controlled queueing system. Research partially supported by the Natural Science Foundation of Guangdong Province (Grant No: 06025063) and the Natural Science Foundation of China (Grant No: 10626021).

16.

On the optimality of a full-service policy for a queueing system with discounted costs

Shaler Stidham Jr. 《Mathematical Methods of Operations Research》2005,62(3):485-497

We provide weak sufficient conditions for a full-service policy to be optimal in a queueing control problem in which the service rate is a dynamic decision variable. In our model there are service costs and holding costs and the objective is to minimize the expected total discounted cost over an infinite horizon. We begin with a semi-Markov decision model for a single-server queue with exponentially distributed inter-arrival and service times. Then we present a general model with weak probabilistic assumptions and demonstrate that the full-service policy minimizes both finite-horizon and infinite-horizon total discounted cost on each sample path.

17.

Markov programming by successive approximations with respect to weighted supremum norms

J. Wessels 《Journal of Mathematical Analysis and Applications》1977,58(2):326-335

Markovian decision processes are considered in the setting of discrete time, countable state space, and general decision space. By introducing a Banach space with a weighted supremum norm, conditions are derived which guarantee convergence of successive approximations to the value function. These conditions are weaker than those required by the usual supnorm approach. Several properties of the successive approximations are derived.

18.

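Ergodicity and stationary distribution of generalized birth-death processes

Yourong Tang Zaiming Liu (唐有荣 刘再明) 《Acta Mathematica Scientia, Series A (数学物理学报(A辑))》1998,18(1):25-32

Reference [1] studied upward integral-type random functionals of generalized birth-death processes and obtained several numerical characteristics of these processes together with necessary and sufficient conditions for recurrence. This paper discusses downward integral-type random functionals of generalized birth-death processes, gives necessary and sufficient conditions for ergodicity and a formula for the mean return time, and, under the ergodicity condition, derives the stationary distribution.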

19.

Deciding nonconstructibility of 3-balls with spanning edges and interior vertices

Satoshi Kamei 《Discrete Mathematics》2007,307(24):3201-3206

Constructibility is a combinatorial property of simplicial complexes. In general, it requires a great deal of time to decide whether a simplicial complex is constructible or not. In this paper, we consider sufficient conditions for nonconstructibility of simplicial 3-balls to investigate efficient algorithms for the decision problem.

20.

Analysis of a Lotka–Volterra food chain chemostat with converting time delays

Fengyan Wang Guoping Pang Shuwen Zhang 《Chaos, Solitons and Fractals》2009,42(5):2786-2795

A model of the food chain chemostat involving predator, prey and growth-limiting nutrients is considered. The model incorporates two discrete time delays in order to describe the time involved in the conversion processes. Lotka–Volterra type increasing functions are used to describe the species' uptake. In addition to showing that solutions with positive initial conditions are positive and bounded, we establish sufficient conditions for (i) the local stability and instability of the positive equilibrium and (ii) the global stability of the non-negative equilibria. Numerical simulation suggests that the delays have both destabilizing and stabilizing effects, and that the system can produce stable periodic solutions, quasi-periodic solutions and strange attractors.