Similar Documents
Found 20 similar documents (search time: 843 ms)
1.
We examine n-player stochastic games. These are dynamic games where a play evolves in stages along a finite set of states; at each stage, the players independently choose actions in the present state, and these choices determine a stage payoff to each player as well as a transition to a new state where actions have to be chosen at the next stage. Each player evaluates the infinite sequence of his stage payoffs by taking the limiting average. Normally, stochastic games are examined under the condition of full monitoring, i.e. at any stage each player observes the present state and the actions chosen by all players. This paper is a first attempt towards understanding under what circumstances equilibria can exist in n-player stochastic games without full monitoring. We demonstrate the non-existence of ε-equilibria in n-player stochastic games, with respect to the average reward, when at each stage each player can observe the present state, his own action, his own payoff, and the payoffs of the other players, but cannot observe their actions. For this purpose, we present and examine a counterexample with 3 players. If we further drop the assumption that the players can observe the payoffs of the others, then counterexamples already exist in games with only 2 players.
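The limiting-average evaluation described above can be made concrete for a fixed stationary strategy profile, under which play reduces to a Markov chain. Below is a minimal sketch (not from the paper; the chain, rewards, and iteration count are invented for illustration) computing the Cesàro average of expected stage rewards:

```python
def limiting_average(P, r, iters=10000):
    # Cesàro average of expected stage rewards for a Markov chain with
    # transition matrix P (rows sum to 1) and per-state reward r,
    # started from the uniform distribution.
    n = len(P)
    mu = [1.0 / n] * n
    total = 0.0
    for _ in range(iters):
        total += sum(mu[s] * r[s] for s in range(n))                      # expected stage reward
        mu = [sum(mu[s] * P[s][t] for s in range(n)) for t in range(n)]   # one transition step
    return total / iters

# Two states, reward 1 only in state 1; the stationary law is (1/2, 1/2),
# so the limiting average reward is 1/2.
avg = limiting_average([[0.5, 0.5], [0.5, 0.5]], [0.0, 1.0])
```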

2.
We study stochastic differential games of jump diffusions driven by Brownian motions and compensated Poisson random measures, where one of the players can choose the stochastic control and the other player can decide when to stop the system. We prove a verification theorem for such games in terms of a Hamilton–Jacobi–Bellman variational inequality. The results are applied to study some specific examples, including optimal resource extraction in a worst-case scenario, and risk minimizing optimal portfolio and stopping.

3.
In this paper, we address various types of two-person stochastic games—both zero-sum and nonzero-sum, discounted and undiscounted. In particular, we address different aspects of stochastic games, namely: (1) When is a two-person stochastic game completely mixed? (2) Can we identify classes of undiscounted zero-sum stochastic games that have stationary optimal strategies? (3) When does a two-person stochastic game possess symmetric optimal/equilibrium strategies? Firstly, we provide some necessary and some sufficient conditions under which certain classes of discounted and undiscounted stochastic games are completely mixed. In particular, we show that, if a discounted zero-sum switching control stochastic game with symmetric payoff matrices has a completely mixed stationary optimal strategy, then the stochastic game is completely mixed if and only if the matrix games restricted to states are all completely mixed. Secondly, we identify certain classes of undiscounted zero-sum stochastic games that have stationary optima under specific conditions on the individual payoff matrices and transition probabilities. Thirdly, we provide sufficient conditions for discounted as well as certain classes of undiscounted stochastic games to have symmetric optimal/equilibrium strategies—namely, that transitions are symmetric and the payoff matrices of one player are the transposes of those of the other. We also provide a sufficient condition for the stochastic game to have a symmetric pure-strategy equilibrium. Finally, we provide examples to show the sharpness of our results.

4.
In this paper, we consider discrete-time N-person constrained stochastic games with discounted cost criteria. The state space is denumerable and the action space is a Borel set, while the cost functions are allowed to be unbounded from below and above. Under conditions weaker than those in (Alvarez-Mena and Hernández-Lerma, Math Methods Oper Res 63:261–285, 2006) for bounded cost functions, we show the existence of a Nash equilibrium for the constrained games by introducing two approximations. The first, as in (Alvarez-Mena and Hernández-Lerma, 2006), constructs a sequence of finite games to approximate a (constrained) auxiliary game with an initial distribution concentrated on a finite set. However, without the bounded-cost hypotheses of (Alvarez-Mena and Hernández-Lerma, 2006), we establish the existence of a Nash equilibrium for the auxiliary game with unbounded costs by developing sharper error bounds for the approximation. The second approximation, which is new, constructs a sequence of the auxiliary-type games above and proves that the limit of the sequence of Nash equilibria for these games is a Nash equilibrium for the original constrained game. Our results are illustrated by a controlled queueing system.

5.
We consider two-person zero-sum games of stopping: two players sequentially observe a stochastic process with infinite time horizon. Player I selects a stopping time and player II picks the distribution of the process. The payoff is given by the expected value of the stopped process. Results of Irle (1990) on existence of value and equivalence of randomization for such games with finite time horizon, where the set of strategies for player II is dominated in the measure-theoretical sense, are extended to the infinite-time case. Furthermore, we treat such games when the set of strategies for player II is not dominated. A counterexample shows that even in the finite-time case such games may not have a value. A sufficient condition for the existence of a value is then given which applies to prophet-type games.

6.
We treat non-cooperative stochastic games with countable state space and with finitely many players each having finitely many moves available in a given state. As a function of the current state and move vector, each player incurs a nonnegative cost. Assumptions are given for the expected discounted cost game to have a Nash equilibrium randomized stationary strategy. These conditions hold for bounded costs, thereby generalizing Parthasarathy (1973) and Federgruen (1978). Assumptions are given for the long-run average expected cost game to have a Nash equilibrium randomized stationary strategy, under which each player has constant average cost. A flow control example illustrates the results. This paper complements the treatment of the zero-sum case in Sennott (1993a).

7.
Zero-sum stochastic games with countable state space and with finitely many moves available to each player in a given state are treated. As a function of the current state and the moves chosen, player I incurs a nonnegative cost and player II receives this as a reward. For both the discounted and average cost cases, assumptions are given for the game to have a finite value and for the existence of an optimal randomized stationary strategy pair. In the average cost case, the assumptions generalize those given in Sennott (1993) for the case of a Markov decision chain. Theorems of Hoffman and Karp (1966) and Nowak (1992) are obtained as corollaries. Sufficient conditions are given for the assumptions to hold. A flow control example illustrates the results.
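In the discounted case, the value of such a game can be computed by Shapley-style value iteration: iterate v(s) ← val of the one-shot matrix game with entries r(s,a,b) + β Σ p(s'|s,a,b) v(s'). Below is a hedged sketch, restricted to 2×2 action sets so the matrix-game value has a closed form; the example game and discount factor are invented, not taken from the paper:

```python
def val_2x2(M):
    # Value of a 2x2 zero-sum matrix game (row player maximizes).
    (a, b), (c, d) = M
    lower = max(min(a, b), min(c, d))   # maximin
    upper = min(max(a, c), max(b, d))   # minimax
    if lower == upper:                  # pure saddle point exists
        return lower
    return (a * d - b * c) / (a + d - b - c)  # fully mixed value

def shapley_iteration(reward, trans, beta, tol=1e-10):
    """reward[s][i][j]: stage payoff; trans[s][i][j]: next-state distribution."""
    n = len(reward)
    v = [0.0] * n
    while True:
        new = []
        for s in range(n):
            M = [[reward[s][i][j]
                  + beta * sum(p * v[t] for t, p in enumerate(trans[s][i][j]))
                  for j in range(2)] for i in range(2)]
            new.append(val_2x2(M))
        if max(abs(x - y) for x, y in zip(new, v)) < tol:
            return new
        v = new

# Single state, matching-pennies-style payoffs, discount 0.5.
reward = [[[1.0, 0.0], [0.0, 1.0]]]
trans = [[[[1.0], [1.0]], [[1.0], [1.0]]]]
v = shapley_iteration(reward, trans, beta=0.5)
```

For a single state with payoff matrix R, the fixed point satisfies v = val(R) + βv, so v = val(R)/(1 − β); here val(R) = 1/2 and β = 1/2 give v = 1.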

8.
In this paper, we consider the continuous-time nonzero-sum stochastic games under the constrained average criteria. The state space is denumerable and the action space of each player is a general Polish space. The transition rates, reward and cost functions are allowed to be unbounded. The main hypotheses in this paper include the standard drift conditions, continuity-compactness condition and some ergodicity assumptions. By applying the vanishing discount method, we obtain the existence of stationary constrained average Nash equilibria.

9.
Given a non-zero-sum discounted stochastic game with finitely many states and actions, one can form a bimatrix game whose pure strategies are the pure stationary strategies of the players and whose penalty payoffs consist of the total discounted costs over all states at any pure stationary pair. It is shown that any Nash equilibrium point of this bimatrix game can be used to find a Nash equilibrium point of the stochastic game whenever the law of motion is controlled by one player. The theorem is extended to undiscounted stochastic games with irreducible transitions when the law of motion is controlled by one player. Examples are worked out to illustrate the proposed algorithm. The work of this author was supported in part by NSF grants DMS-9024408 and DMS-8802260.
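The bimatrix reduction above can be illustrated on the simplest case. The sketch below only enumerates pure-strategy equilibria of a 2×2 bimatrix game with both players maximizing; the paper's actual construction enumerates all pure stationary strategies and uses total discounted costs, which is omitted here, and the example payoffs are invented:

```python
def pure_nash_2x2(A, B):
    # Pure-strategy Nash equilibria of a 2x2 bimatrix game where player 1
    # picks the row (payoff A[i][j]) and player 2 the column (payoff B[i][j]);
    # both players maximize their own payoff.
    eqs = []
    for i in range(2):
        for j in range(2):
            row_best = A[i][j] >= A[1 - i][j]   # no profitable row deviation
            col_best = B[i][j] >= B[i][1 - j]   # no profitable column deviation
            if row_best and col_best:
                eqs.append((i, j))
    return eqs

# A prisoner's-dilemma-style example: (1, 1) is the unique pure equilibrium.
eqs = pure_nash_2x2([[3, 0], [5, 1]], [[3, 5], [0, 1]])
```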

10.
In stochastic games with finite state and action spaces, we examine the existence of equilibria where player 1 uses the limiting average reward and player 2 a discounted reward to evaluate the respective payoff sequences. By the nature of these rewards, the far future determines player 1's reward, while player 2 is mainly interested in the near future. This gives rise to a natural cooperation between the players along the course of the play. First, we show the existence of stationary ε-equilibria, for all ε>0, in these games. Besides these stationary ε-equilibria, there also exist ε-equilibria in terms of only slightly more complex ultimately stationary strategies, which are more in the spirit of these games: after a large stage, when the discounted game is no longer interesting, the players cooperate to guarantee the highest feasible reward to player 1. Moreover, we analyze an interesting example demonstrating that 0-equilibria need not exist in these games, not even in terms of history-dependent strategies. Finally, we examine special classes of stochastic games with specific conditions on the transition and payoff structures. Several examples are given to clarify these issues.

11.
A class of N-person stochastic games of resource extraction with discounted payoffs in discrete time is considered. It is assumed that the transition probabilities have a special additive structure. It is shown that the Nash equilibria and corresponding payoffs in finite-horizon games converge as the horizon goes to infinity. This implies the existence of stationary Nash equilibria in the infinite-horizon case. In addition, an algorithm for finding Nash equilibria in infinite-horizon games is discussed.
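The convergence of finite-horizon values as the horizon grows can be sketched in the single-player (Markov decision process) special case; the game version replaces the max by an equilibrium computation at each stage. A hedged illustration with invented data:

```python
def finite_horizon_value(reward, trans, beta, T):
    # Backward induction: v_0 = 0 and
    # v_{t+1}(s) = max_a [ r(s,a) + beta * sum_s' p(s'|s,a) * v_t(s') ].
    n = len(reward)
    v = [0.0] * n
    for _ in range(T):
        v = [max(reward[s][a]
                 + beta * sum(p * v[t] for t, p in enumerate(trans[s][a]))
                 for a in range(len(reward[s])))
             for s in range(n)]
    return v

# One state, two actions with rewards 1 and 0, discount 0.9:
# the horizon-T value is (1 - 0.9**T) / 0.1, which tends to 10.
trans = [[[1.0], [1.0]]]
v20 = finite_horizon_value([[1.0, 0.0]], trans, 0.9, 20)[0]
v200 = finite_horizon_value([[1.0, 0.0]], trans, 0.9, 200)[0]
```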

12.
Eitan Altman, Queueing Systems, 1996, 23(1–4): 259–279
The purpose of this paper is to investigate situations of non-cooperative dynamic control of queueing systems by two agents having different objectives. The main part of the paper is devoted to analyzing a problem of admission and service (vacation) control. The admission controller has to decide whether to allow arrivals to occur. Once the queue empties, the server goes on vacation and controls the vacation durations (according to the state and past history of the queue). The immediate costs for each controller are increasing in the number of customers, but no convexity assumptions are made. The controllers are shown to have a stationary equilibrium policy pair, in which each controller uses a stationary threshold-type policy with randomization in at most one state. We then investigate a non-zero-sum stochastic game between a router into several queues and a second controller that allocates some extra service capacity to one of the queues. We establish an equilibrium policy pair in which the router uses the intuitive Join-the-Shortest-Queue policy.
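The benefit of Join-the-Shortest-Queue routing can be seen in a toy simulation. This is not the paper's model: the discrete-time dynamics, Bernoulli parameters, and seed below are invented purely to contrast JSQ with coin-flip routing:

```python
import random

def avg_queue_length(policy, T=20000, p_arr=0.9, p_srv=0.5, seed=7):
    # Toy two-queue, discrete-time model: one Bernoulli arrival stream routed
    # by `policy`; each non-empty queue completes a service w.p. p_srv per slot.
    rng = random.Random(seed)
    q = [0, 0]
    total = 0
    for _ in range(T):
        if rng.random() < p_arr:
            if policy == "jsq":                 # Join the Shortest Queue
                i = 0 if q[0] <= q[1] else 1
            else:                               # route by a fair coin flip
                i = rng.randint(0, 1)
            q[i] += 1
        for i in range(2):
            if q[i] > 0 and rng.random() < p_srv:
                q[i] -= 1
        total += q[0] + q[1]
    return total / T

# Under heavy load, JSQ balances the servers and keeps far fewer customers
# waiting than blind randomized routing.
jsq = avg_queue_length("jsq")
coin = avg_queue_length("coin")
```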

13.
In this paper, the effect of perturbations of game parameters (payoff function, transition probability function, and discount factor) on values and optimal strategies is studied for the class of zero-sum games in normal form and for the class of stationary, discounted, two-person, zero-sum stochastic games. A main result is that, under certain conditions, the value depends on these parameters in a pointwise Lipschitz continuous way and that the sets of ε-optimal strategies for both players are upper semicontinuous multifunctions of the game parameters. Extensions to general-sum games and nonstationary stochastic games are also indicated.

14.
In this paper, we consider positive stochastic games when the state and action spaces are all infinite. We prove that, under certain conditions, the positive stochastic game has a value, that the maximizing player has an ε-optimal stationary strategy, and that the minimizing player has an optimal stationary strategy. The authors are grateful to Professor David Blackwell and the referee for some useful comments.

15.
A queueing model is considered in which a controller can increase the service rate. There is a holding cost represented by a function h and a service cost proportional to the increased rate with coefficient l. The objective is to minimize the total expected discounted cost. When h and l are small and the system operates in heavy traffic, the control problem can be approximated by a singular stochastic control problem for Brownian motion, namely the so-called reflected follower problem. The optimal policy in this problem is characterized by a single number z*, so that the optimal process is a reflected diffusion in [0, z*]. To obtain z*, one needs to solve a free boundary problem for a second-order ordinary differential equation. For the original problem, the policy which increases the service rate to its maximum when the normalized queue length exceeds z* is approximately optimal.
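The reflected diffusion in [0, z*] can be simulated with a simple Euler projection scheme. This is only a sketch: z*, the volatility, and the step size below are placeholders, and obtaining the true z* requires solving the free boundary problem described above:

```python
import math
import random

def reflected_path(z_star, steps=5000, dt=1e-3, sigma=1.0, seed=0):
    # Euler scheme with projection: a driftless diffusion kept in [0, z*]
    # by reflecting (projecting) the state at both barriers.
    rng = random.Random(seed)
    x = z_star / 2.0
    lo, hi = x, x
    for _ in range(steps):
        x += sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)  # Brownian increment
        x = min(max(x, 0.0), z_star)                      # reflect at 0 and z*
        lo, hi = min(lo, x), max(hi, x)
    return lo, hi

lo, hi = reflected_path(z_star=1.0)
```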

16.
This paper discusses the existence of optimal or nearly optimal stationary strategies for a player engaged in a nonleavable stochastic game. It is known that, for these games, player I need not have an ε-optimal stationary strategy even when the state space of the game is finite. In contrast, we show that uniformly ε-optimal stationary strategies are available to player II in nonleavable stochastic games with finite state space. Our methods also yield sufficient conditions for the existence of optimal and ε-optimal stationary strategies for player II in games with countably infinite state space. To introduce and explain the main results of the paper, special consideration is given to a particular class of nonleavable games whose utility is the indicator of a subset of the state space of the game.

17.
In this paper, we deal with two-person zero-sum stochastic games for discrete-time Markov processes. The optimality criterion studied is the discounted payoff during a first passage time to some target set, where the discount factor is state-dependent. The state and action spaces are all Borel spaces, and the payoff functions are allowed to be unbounded. Under suitable conditions, we first establish the optimality equation. Then, using dynamic programming techniques, we obtain the existence of the value of the game and of a pair of optimal stationary policies. Moreover, we present the exponential convergence of the value iteration and a 'martingale characterization' of a pair of optimal policies. Finally, we illustrate the applications of our main results with an inventory system.

18.
This paper considers discounted noncooperative stochastic games with uncountable state space and compact metric action spaces. We assume that the transition law is absolutely continuous with respect to some probability measure defined on the state space. We prove, under certain additional continuity and integrability conditions, that such games have ε-equilibrium stationary strategies for each ε>0. To prove this fact, we provide a method for approximating the original game by a sequence of finite or countable state games. The main result of this paper partially answers a question raised by Parthasarathy in Ref. 1.

19.
We show that obtainable equilibria of a multi-period nonatomic game can be used by players in its large finite counterparts to achieve near-equilibrium payoffs. Such equilibria, in the form of random state-to-action rules, are parsimonious in form and easy to execute, as they are both oblivious of past history and blind to other players' present states. Our transient results can be extended to a stationary case, where the finite multi-period games are special discounted stochastic games. In both nonatomic and finite games, players' states influence their payoffs along with the actions they take; also, the random evolution of one particular player's state is driven by all players' states as well as actions. The finite games can model diverse situations such as dynamic price competition, but they are notoriously difficult to analyze. Our results thus suggest ways to tackle these problems approximately.

20.
We consider a discrete-time constrained Markov decision process under the discounted cost optimality criterion. The state and action spaces are assumed to be Borel spaces, while the cost and constraint functions might be unbounded. We are interested in numerically approximating the optimal discounted constrained cost. To this end, we suppose that the transition kernel of the Markov decision process is absolutely continuous with respect to some probability measure μ. Then, by solving the linear programming formulation of a constrained control problem related to the empirical probability measure μn of μ, we obtain the corresponding approximation of the optimal constrained cost. We derive a concentration inequality which bounds the probability that the estimation error is larger than a given constant. This bound is shown to decrease exponentially in n. Our theoretical results are illustrated with a numerical application based on a stochastic version of the Beverton–Holt population model.
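A deterministic skeleton of the Beverton–Holt model used in the numerical application can be sketched as follows (the stochastic version perturbs this recursion; the parameters below are invented for illustration):

```python
def beverton_holt(x0, R, K, T):
    # Deterministic Beverton-Holt recursion
    #   x_{t+1} = R * x_t / (1 + (R - 1) * x_t / K);
    # for R > 1 the population converges to the carrying capacity K.
    x = x0
    for _ in range(T):
        x = R * x / (1 + (R - 1) * x / K)
    return x

# Growth factor 2 and carrying capacity 50: the trajectory settles at 50.
x_final = beverton_holt(x0=1.0, R=2.0, K=50.0, T=200)
```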


Copyright © Beijing Qinyun Technology Development Co., Ltd. 京ICP备09084417号