1.

On the optimality equation for zero-sum ergodic stochastic games

Anna Jaśkiewicz Andrzej S. Nowak 《Mathematical Methods of Operations Research》2001,54(2):291-301

2.

Optimal strategies in a class of zero-sum ergodic stochastic games

Andrzej S. Nowak 《Mathematical Methods of Operations Research》1999,50(3):399-419

3.

Zero-sum stochastic games with unbounded costs: Discounted and average cost cases

Linn I. Sennott 《Mathematical Methods of Operations Research》1994,39(2):209-225

Zero-sum stochastic games with countable state space and with finitely many moves available to each player in a given state are treated. As a function of the current state and the moves chosen, player I incurs a nonnegative cost and player II receives this as a reward. For both the discounted and average cost cases, assumptions are given for the game to have a finite value and for the existence of an optimal randomized stationary strategy pair. In the average cost case, the assumptions generalize those given in Sennott (1993) for the case of a Markov decision chain. Theorems of Hoffman and Karp (1966) and Nowak (1992) are obtained as corollaries. Sufficient conditions are given for the assumptions to hold. A flow control example illustrates the results.

4.

Optimality in different strategy classes in zero-sum stochastic games

J. Flesch F. Thuijsman O. J. Vrieze 《Mathematical Methods of Operations Research》2002,56(2):315-322

5.

New optimality conditions for average-payoff continuous-time Markov games in Polish spaces

HERNNDEZ-LERMA Onsimo 《中国科学数学(英文版)》2011,(4)

This paper concerns two-person zero-sum games for a class of average-payoff continuous-time Markov processes in Polish spaces. The underlying processes are determined by transition rates that are allowed to be unbounded, and the payoff function may have neither upper nor lower bounds. We use two optimality inequalities to replace the so-called optimality equation in the previous literature. Under more general conditions, these optimality inequalities yield the existence of the value of the game and of a pair of ...

6.

On stochastic games with additive reward and transition structure

T. E. S. Raghavan S. H. Tijs O. J. Vrieze 《Journal of Optimization Theory and Applications》1985,47(4):451-464

In this paper, we introduce a new class of two-person stochastic games with nice properties. For games in this class, the payoffs as well as the transitions in each state consist of a part which depends only on the action of the first player and a part dependent only on the action of the second player. For the zero-sum games in this class, we prove that the orderfield property holds in the infinite-horizon case and that there exist optimal pure stationary strategies for the discounted as well as the undiscounted payoff criterion. For both criteria also, finite algorithms are given to solve the game. An example shows that, for nonzero sum games in this class, there are not necessarily pure stationary equilibria. But, if such a game possesses a stationary equilibrium point, then there also exists a stationary equilibrium point which uses in each state at most two pure actions for each player.

7.

Nonzero-sum stochastic games with unbounded costs: Discounted and average cost cases

Linn I. Sennott 《Mathematical Methods of Operations Research》1994,40(2):145-162

We treat non-cooperative stochastic games with countable state space and with finitely many players each having finitely many moves available in a given state. As a function of the current state and move vector, each player incurs a nonnegative cost. Assumptions are given for the expected discounted cost game to have a Nash equilibrium randomized stationary strategy. These conditions hold for bounded costs, thereby generalizing Parthasarathy (1973) and Federgruen (1978). Assumptions are given for the long-run average expected cost game to have a Nash equilibrium randomized stationary strategy, under which each player has constant average cost. A flow control example illustrates the results. This paper complements the treatment of the zero-sum case in Sennott (1993a).

8.

Nonzero-sum semi-Markov games with the expected average payoffs

Andrzej S. Nowak Anna Jaśkiewicz 《Mathematical Methods of Operations Research》2005,62(1):23-40

Nonzero-sum ergodic semi-Markov games with Borel state spaces are studied. An equilibrium theorem is proved in the class of correlated stationary strategies using public randomization. Under some additivity assumption concerning the transition probabilities, stationary Nash equilibria are also shown to exist. Received: October 2004 / Revised: January 2005

9.

Existence of equilibrium stationary strategies in discounted noncooperative stochastic games with uncountable state space

A. S. Nowak 《Journal of Optimization Theory and Applications》1985,45(4):591-602

This paper considers discounted noncooperative stochastic games with uncountable state space and compact metric action spaces. We assume that the transition law is absolutely continuous with respect to some probability measure defined on the state space. We prove, under certain additional continuity and integrability conditions, that such games have ε-equilibrium stationary strategies for each ε > 0. To prove this fact, we provide a method for approximating the original game by a sequence of finite or countable state games. The main result of this paper answers partially a question raised by Parthasarathy in Ref. 1.

10.

Zero-sum Markov games and worst-case optimal control of queueing systems

Eitan Altman Arie Hordijk 《Queueing Systems》1995,21(3-4):415-447

Zero-sum stochastic games model situations where two persons, called players, control some dynamic system, and both have opposite objectives. One player typically wishes to minimize a cost which has to be paid to the other player. Such a game may also be used to model problems with a single controller who has only partial information on the system: the dynamics of the system may depend on some parameter that is unknown to the controller, and may vary in time in an unpredictable way. A worst-case criterion may be considered, where the unknown parameter is assumed to be chosen by nature (called player 1), and the objective of the controller (player 2) is then to design a policy that guarantees the best performance under worst-case behaviour of nature. The purpose of this paper is to present a survey of stochastic games in queues, where both tools and applications are considered. The first part is devoted to the tools. We present some existing tools for solving finite horizon and infinite horizon discounted Markov games with unbounded cost, and develop new ones that are typically applicable in queueing problems. We then present some new tools and theory of expected average cost stochastic games with unbounded cost. In the second part of the paper we present a survey on existing results on worst-case control of queues, and illustrate the structural properties of best policies of the controller, worst-case policies of nature, and of the value function. Using the theory developed in the first part of the paper, we extend some of the above results, which were known to hold for finite horizon costs or for the discounted cost, to the expected average cost.

11.

Existence of stationary equilibrium strategies in non-zero sum discounted stochastic games with uncountable state space and state-independent transitions

T. Parthasarathy S. Sinha 《International Journal of Game Theory》1989,18(2):189-194

Non-zero sum discounted stochastic games with uncountable state space and state-independent transitions have stationary equilibrium strategies.

12.

Stationary equilibria in cyclic games: Search and structure

V. N. Lebedev 《Mathematical Notes》2000,67(6):771-777

The existence of optimal stationary strategies for a cyclic game played on the vertices of a bipartite graph up to the first cycle with the payoff of one player to the other equaling the sum of the maximal and minimal local payoffs on this cycle is proved. This result implies that the problem belongs to the class NP ∩ co-NP; a polynomial algorithm that yields optimal strategies for ergodic extensions of matrix games is given. Translated from Matematicheskie Zametki, Vol. 67, No. 6, pp. 913–921, June, 2000.

13.

Unbounded cost Markov decision processes with limsup and liminf average criteria: new conditions

Quanxin Zhu Xianping Guo Yonglong Dai 《Mathematical Methods of Operations Research》2005,61(3):469-482

14.

Existence of value in differential games with fixed time duration

L. S. Zaremba 《Journal of Optimization Theory and Applications》1982,38(4):581-598

A differential game of prescribed duration with general-type phase constraints is investigated. The existence of a value in the Varaiya-Lin sense and an optimal strategy for one of the players is obtained under assumptions ensuring that the sets of all admissible trajectories for the two players are compact in the Banach space of all continuous functions. These results are then extended to more general games, examined earlier by Varaiya. The author wishes to express his thanks to an anonymous reviewer for his many valuable comments.

15.

A New Condition and Approach for Zero-Sum Stochastic Games with Average Payoffs

Xianping Guo Jie Yang 《Stochastic Analysis and Applications》2013,31(3):537-561

This article deals with discrete-time two-person zero-sum stochastic games with Borel state and action spaces. The optimality criterion to be studied is the long-run expected average payoff criterion, and the (immediate) payoff function may have neither upper nor lower bounds. We first replace the optimality equation widely used in the previous literature with two so-called optimality inequalities, and give a new set of conditions for the existence of solutions to the optimality inequalities. Then, from the optimality inequalities we ensure the existence of a pair of average optimal stationary strategies. Our new condition is slightly weaker than those in the previous literature, and as a byproduct some interesting results, such as the convergence of a value iteration scheme to the value of the discounted payoff game, are obtained. Finally, we first apply the main results in this article to generalized inventory systems, and then further provide an example of controlled population processes for which all of our conditions are satisfied, while some of the conditions in the previous literature fail to hold.

16.

Denumerable state semi-Markov decision processes with unbounded costs, average cost criterion

A. Federgruen A. Hordijk H.C. Tijms 《Stochastic Processes and their Applications》1979,9(2):223-235

This paper establishes a rather complete optimality theory for the average cost semi-Markov decision model with a denumerable state space, compact metric action sets and unbounded one-step costs for the case where the underlying Markov chains have a single ergodic set. Under a condition which, roughly speaking, requires the existence of a finite set such that the supremum over all stationary policies of the expected time and the total expected absolute cost incurred until the first return to this set are finite for any starting state, we shall verify the existence of a finite solution to the average cost optimality equation and the existence of an average cost optimal stationary policy.

17.

Complex differential games of pursuit-evasion type with state constraints, part 1: Necessary conditions for optimal open-loop strategies

M. H. Breitner H. J. Pesch W. Grimm 《Journal of Optimization Theory and Applications》1993,78(3):419-441

Complex pursuit-evasion games with state variable inequality constraints are investigated. Necessary conditions of the first and the second order for optimal trajectories are developed, which enable the calculation of optimal open-loop strategies. The necessary conditions on singular surfaces induced by state constraints and non-smooth data are discussed in detail. These conditions lead to multi-point boundary-value problems which can be solved very efficiently and very accurately by the multiple shooting method. A realistically modelled pursuit-evasion problem for one air-to-air missile versus one high performance aircraft in a vertical plane serves as an example. For this pursuit-evasion game, the barrier surface is investigated, which determines the firing range of the missile. The numerical method for solving this problem and extensive numerical results will be presented and discussed in Part 2 of this paper; see Ref. 1. This paper is dedicated to the memory of Professor John V. Breakwell. The authors would like to express their sincere and grateful appreciation to Professors R. Bulirsch and K. H. Well for their encouraging interest in this work.