共查询到20条相似文献,搜索用时 0 毫秒
1.
We study some games of perfect information in which two players move alternately along the edges of a finite directed graph with weights attached to its edges. One of them wants to maximize and the other to minimize some means of the encountered weights. 相似文献
2.
We give a policy iteration algorithm to solve zero-sum stochastic games with finite state and action spaces and perfect information, when the value is defined in terms of the mean payoff per turn. This algorithm does not require any irreducibility assumption on the Markov chains determined by the strategies of the players. It is based on a discrete nonlinear analogue of the notion of reduction of a super-harmonic function. To cite this article: J. Cochet-Terrasson, S. Gaubert, C. R. Acad. Sci. Paris, Ser. I 343 (2006). 相似文献
3.
Henrik Björklund 《Discrete Applied Mathematics》2007,155(2):210-229
We suggest the first strongly subexponential and purely combinatorial algorithm for solving the mean payoff games problem. It is based on iteratively improving the longest shortest distances to a sink in a possibly cyclic directed graph.We identify a new “controlled” version of the shortest paths problem. By selecting exactly one outgoing edge in each of the controlled vertices we want to make the shortest distances from all vertices to the unique sink as long as possible. The decision version of the problem (whether the shortest distance from a given vertex can be made bigger than a given bound?) belongs to the complexity class NP∩CONP. Mean payoff games are easily reducible to this problem. We suggest an algorithm for computing longest shortest paths. Player MAX selects a strategy (one edge from each controlled vertex) and player MIN responds by evaluating shortest paths to the sink in the remaining graph. Then MAX locally changes choices in controlled vertices looking at attractive switches that seem to increase shortest paths lengths (under the current evaluation). We show that this is a monotonic strategy improvement, and every locally optimal strategy is globally optimal. This allows us to construct a randomized algorithm of complexity , which is simultaneously pseudopolynomial (W is the maximal absolute edge weight) and subexponential in the number of vertices n. All previous algorithms for mean payoff games were either exponential or pseudopolynomial (which is purely exponential for exponentially large edge weights). 相似文献
4.
Endre Boros Khaled Elbassioni Vladimir Gurvich Kazuhisa Makino 《Optimization Letters》2017,11(8):1499-1512
We consider two-person zero-sum stochastic mean payoff games with perfect information, or BWR-games, given by a digraph \(G = (V, E)\), with local rewards \(r{:}\,E \rightarrow \mathbb {Z}\), and three types of positions: black \(V_B\), white \(V_W\), and random \(V_R\) forming a partition of V. It is a long-standing open question whether a polynomial time algorithm for BWR-games exists, even when \(|V_R|=0\). In fact, a pseudo-polynomial algorithm for BWR-games would already imply their polynomial solvability. In this short note, we show that BWR-games can be solved via convex programming in pseudo-polynomial time if the number of random positions is a constant. 相似文献
5.
We discuss the tropical analogues of several basic questions of convex duality. In particular, the polar of a tropical polyhedral cone represents the set of linear inequalities that its elements satisfy. We characterize the extreme rays of the polar in terms of certain minimal set covers which may be thought of as weighted generalizations of minimal transversals in hypergraphs. We also give a tropical analogue of Farkas lemma, which allows one to check whether a linear inequality is implied by a finite family of linear inequalities. Here, the certificate is a strategy of a mean payoff game. We discuss examples, showing that the number of extreme rays of the polar of the tropical cyclic polyhedral cone is polynomially bounded, and that there is no unique minimal system of inequalities defining a given tropical polyhedral cone. 相似文献
6.
A. Sprzeuzkouski 《Journal of Optimization Theory and Applications》1976,18(1):103-118
It is shown that a saddle-point solution exists in a two-person, zero-sum game whose payoff is given by a matrix which is not completely defined. On the other hand, we show that such games do not always have a value, so that a saddle-point solution is not necessarily an optimal solution.This work was supported by the Centre d'Etudes Atomiques, Saclay, France. 相似文献
7.
V. I. Ukhobotov D. V. Gushchin 《Proceedings of the Steklov Institute of Mathematics》2011,275(1):178-185
Single-type differential games are considered. The problem of taking a phase point to a disk of a fixed radius at a given time is studied. The payoff is the integral of a convex function depending on the norm of the first player’s control. 相似文献
8.
Mrinal K. Ghosh Anindya Goswami 《Journal of Mathematical Analysis and Applications》2008,345(1):26-39
We study a zero-sum partially observed semi-Markov game with average payoff on a countable state space. Under certain ergodicity conditions we show that a saddle point equilibrium exists. We achieve this by solving the corresponding average cost optimality equation using a span contraction method. The average value is shown to be the unique zero of a Lipschitz continuous function. A value iteration scheme is developed to compute the value. 相似文献
9.
We study stochastic games with countable state space, compact action spaces, and limiting average payoff. ForN-person games, the existence of an equilibrium in stationary strategies is established under a certain Liapunov stability condition. For two-person zero-sum games, the existence of a value and optimal strategies for both players are established under the same stability condition.The authors wish to thank Prof. T. Parthasarathy for pointing out an error in an earlier version of this paper. M. K. Ghosh wishes to thank Prof. A. Arapostathis and Prof. S. I. Marcus for their hospitality and support. 相似文献
10.
A. Sprzeuzkouski 《Journal of Optimization Theory and Applications》1977,21(2):225-233
We study two-person, zero-sum matrix games whose payoffs are not defined for every pair of strategies. A necessary and sufficient condition for these games to possess a value is given, and we show that the value can be approximated by using universally playable strategies.This work was supported by the Centre d'Etudes Nucléaires, Saclay, France. 相似文献
11.
《Journal of Applied Mathematics and Mechanics》1998,62(4):555-563
A class of antagonistic linear differential games (DGs) in a fixed time interval with ellipsoidal payoff functional is considered. This class of DGs includes problems which assume both rigid constraints on the players' controls and requirements to minimize control expenses. Other known classes of differential games, such as linear DGs with a quadratic performance index and linear DGs with ellipsoidal terminal sets and admissible sets of controls for the players, considered in Kurzhanskii's ellipsoidal technique, are limiting cases of DGs of this class. The concept of a u-strategic function, which expresses the property of u-stability for ellipsoidal functions, is introduced. An effective algorithm is presented for computing a u-strategic function, based on Kurzhanskii's ellipsoidal technique. The main result of this paper is that a guaranteed positional strategy for player u is defined by a certain explicit formula in terms of a u-strategic function. The proof of this result is based on a viability theorem for differential equations. 相似文献
12.
The celebrated von Neumann minimax theorem is a fundamental theorem in two-person zero-sum games. In this paper, we present a generalization of the von Neumann minimax theorem, called robust von Neumann minimax theorem, in the face of data uncertainty in the payoff matrix via robust optimization approach. We establish that the robust von Neumann minimax theorem is guaranteed for various classes of bounded uncertainties, including the matrix 1-norm uncertainty, the rank-1 uncertainty and the columnwise affine parameter uncertainty. 相似文献
13.
E. P. Cunningham 《Journal of Optimization Theory and Applications》1971,7(4):258-286
This paper deals with a differential game or optimal control problem in which the payoff is the maximum (or minimum), during play, of some scalar functionK of the statex. This unconventional payoff has many practical applications. By defining certain auxiliary games for a significant class of problems, one can show how to solve the general case where more than one maximum ofk(t)=K[x(t)] occurs under optimal play. For a subclass of such problems, it is found thatclosed optimal solutions can exist on certain surfaces in the playing space. As the playing interval becomes indefinitely long, the open optimal trajectories converge to (or diverge from) such surfaces. In particular, for two-dimensional problems of this subclass, the closed optimal trajectories are periodic and called periodic barriers. They are analogous to limit cycles in uncontrolled nonlinear systems.The writer is indebted to Dr. R. Isaacs for many stimulating conversations in connection with this study. 相似文献
14.
This paper attempts to study two-person nonzero-sum games for denumerable continuous-time Markov chains determined by transition rates,with an expected average criterion.The transition rates are allowed to be unbounded,and the payoff functions may be unbounded from above and from below.We give suitable conditions under which the existence of a Nash equilibrium is ensured.More precisely,using the socalled "vanishing discount" approach,a Nash equilibrium for the average criterion is obtained as a limit point of a sequence of equilibrium strategies for the discounted criterion as the discount factors tend to zero.Our results are illustrated with a birth-and-death game. 相似文献
15.
A mean value for games with communication structures 总被引:1,自引:0,他引:1
Gérard Hamiache 《International Journal of Game Theory》2004,32(4):533-544
The mean value is a new extension of the Shapley value for games with communication structure representable by a simple graph; only pairwise meetings can occur, although some of them might not be permitted. The new value is characterized by a set of axioms of which the one with the most far-reaching effect is an associated consistency property already used in various contexts. The mean value of an n-player unanimity game is the arithmetic average of the mean values of (n–1)-player unanimity games with connected support, which means games in which the deleted players are not articulation point of the considered graph.I wish to thank the anonymous referees for their helpful remarks. The usual disclaimer applies.Received: April 2002/Accepted: February 2004 相似文献
16.
Olivier Guant 《Journal de Mathématiques Pures et Appliquées》2009,92(3):276-294
In this article, we present a reference case of mean field games. This case can be seen as a reference for two main reasons. First, the case is simple enough to allow for explicit resolution: Bellman functions are quadratic, stationary measures are normal and stability can be dealt with explicitly using Hermite polynomials. Second, in spite of its simplicity, the case is rich enough in terms of mathematics to be generalized and to inspire the study of more complex models that may not be as tractable as this one. 相似文献
17.
P. Jameson Graber Alpár R. Mészáros 《Annales de l'Institut Henri Poincaré (C) Analyse Non Linéaire》2018,35(6):1557-1576
In this paper we obtain Sobolev estimates for weak solutions of first order variational Mean Field Game systems with coupling terms that are local functions of the density variable. Under some coercivity conditions on the coupling, we obtain first order Sobolev estimates for the density variable, while under similar coercivity conditions on the Hamiltonian we obtain second order Sobolev estimates for the value function. These results are valid both for stationary and time-dependent problems. In the latter case the estimates are fully global in time, thus we resolve a question which was left open in [23]. Our methods apply to a large class of Hamiltonians and coupling functions. 相似文献
18.
In this paper, the effect on values and optimal strategies of perturbations of game parameters (payoff function, transition probability function, and discount factor) is studied for the class of zero-sum games in normal form and for the class of stationary, discounted, two-person, zero-sum stochastic games.A main result is that, under certain conditions, the value depends on these parameters in a pointwise Lipschitz continuous way and that the sets of -optimal strategies for both players are upper semicontinuous multifunctions of the game parameters.Extensions to general-sum games and nonstationary stochastic games are also indicated. 相似文献
19.
For a zero-sum differential game, an algorithm is proposed for computing the value of the game and constructing optimal control
strategies with the help of stepwise minimax. It is assumed that the dynamics can be nonlinear and the cost functional of
the game is the sum of an integral term and a terminal payoff function that satisfies the Lipschitz condition but can be neither
convex nor concave. The players’ controls are chosen from given sets that are generally time-dependent and unbounded. An error
estimate for the algorithm is obtained depending on the number of partition points in the time interval and on the fineness
of the spatial triangulation. Numerical results for an illustrative example are presented. 相似文献
20.
We provide a novel characterization of the feasible payoff set of a general two-player repeated game with unequal discounting. In particular, we show that generically the Pareto frontier shifts outwards and the feasible payoff set expands in the sense of set inclusion, as the time horizon increases. This result reinforces and refines the insight in Lehrer and Pauzner (1999) by showing that a longer horizon enables the players to conduct intertemporal trade in a more flexible fashion. 相似文献