期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Algorithms for uniform optimal strategies in two-player zero-sum stochastic games with perfect information

Konstantin Avrachenkov 《Operations Research Letters》2012,40(1):56-60

We deal with zero-sum two-player stochastic games with perfect information. We propose two algorithms to find the uniform optimal strategies and one method to compute the optimality range of discount factors. We prove the convergence in finite time for one algorithm. The uniform optimal strategies are also optimal for the long run average criterion and, in transient games, for the undiscounted criterion as well. 相似文献

2.

Partially observed semi-Markov zero-sum games with average payoff

Mrinal K. Ghosh Anindya Goswami 《Journal of Mathematical Analysis and Applications》2008,345(1):26-39

We study a zero-sum partially observed semi-Markov game with average payoff on a countable state space. Under certain ergodicity conditions we show that a saddle point equilibrium exists. We achieve this by solving the corresponding average cost optimality equation using a span contraction method. The average value is shown to be the unique zero of a Lipschitz continuous function. A value iteration scheme is developed to compute the value. 相似文献

3.

Subgame-perfection in free transition games

J. Flesch J. Kuipers G. Schoenmakers K. Vrieze 《European Journal of Operational Research》2013

We prove the existence of a subgame-perfect ε-equilibrium, for every ε > 0, in a class of multi-player games with perfect information, which we call free transition games. The novelty is that a non-trivial class of perfect information games is solved for subgame-perfection, with multiple non-terminating actions, in which the payoff structure is generally not (upper or lower) semi-continuous. Due to the lack of semi-continuity, there is no general rule of comparison between the payoffs that a player can obtain by deviating a large but finite number of times or, respectively, infinitely many times. We introduce new techniques to overcome this difficulty. 相似文献

4.

Pure subgame-perfect equilibria in free transition games

J. Kuipers J. Flesch G. Schoenmakers K. Vrieze 《European Journal of Operational Research》2009

We consider a class of stochastic games, where each state is identified with a player. At any moment during play, one of the players is called active. The active player can terminate the game, or he can announce any player, who then becomes the active player. There is a non-negative payoff for each player upon termination of the game, which depends only on the player who decided to terminate. We give a combinatorial proof of the existence of subgame-perfect equilibria in pure strategies for the games in our class. 相似文献

5.

Decision making with imprecise probabilities and utilities by means of statistical preference and stochastic dominance

Ignacio Montes Enrique Miranda Susana Montes 《European Journal of Operational Research》2014

A problem of decision making under uncertainty in which the choice must be made between two sets of alternatives instead of two single ones is considered. A number of choice rules are proposed and their main properties are investigated, focusing particularly on the generalizations of stochastic dominance and statistical preference. The particular cases where imprecision is present in the utilities or in the beliefs associated to two alternatives are considered. 相似文献

6.

Generic uniqueness of the bias vector of finite zero-sum stochastic games with perfect information

Marianne Akian Stéphane Gaubert Antoine Hochart 《Journal of Mathematical Analysis and Applications》2018,457(2):1038-1064

Mean-payoff zero-sum stochastic games can be studied by means of a nonlinear spectral problem. When the state space is finite, the latter consists in finding an eigenpair

(u, λ)

solution of

T (u) = λ e + u

, where

T : R^{n} \to R^{n}

is the Shapley (or dynamic programming) operator, λ is a scalar, e is the unit vector, and

u \in R^{n}

. The scalar λ yields the mean payoff per time unit, and the vector u, called bias, allows one to determine optimal stationary strategies in the mean-payoff game. The existence of the eigenpair

(u, λ)

is generally related to ergodicity conditions. A basic issue is to understand for which classes of games the bias vector is unique (up to an additive constant). In this paper, we consider perfect-information zero-sum stochastic games with finite state and action spaces, thinking of the transition payments as variable parameters, transition probabilities being fixed. We show that the bias vector, thought of as a function of the transition payments, is generically unique (up to an additive constant). The proof uses techniques of nonlinear Perron–Frobenius theory. As an application of our results, we obtain an explicit perturbation scheme allowing one to solve degenerate instances of stochastic games by policy iteration. 相似文献

7.

A markov chain game with dynamic information

G. J. Olsder G. P. Papavassilopoulos 《Journal of Optimization Theory and Applications》1988,59(3):467-486

Two players, not knowing each other's position, move in a domain and can flash a searchlight. The game terminates when one player is caught within the area illuminated by the flash of the other. However, if this first player is not in this area, then the other player has disclosed his position to the former one, who may be able to exploit this information. The game is considered on a finite state space and in discrete time.The work of the second author was supported by ZWO, The Netherlands Organization for the Advancement of Pure Research, Contract No. B62-239, by the US Air Force Office of Scientific Research, Grant No. AFOSR-85-0245, and by the National Science Foundation, Grant No. NSF-INT-8504097.Visiting Professor at Delft University of Technology during 1986. 相似文献

8.

Zero-sum stochastic games with stopping and control

Mrinal K. Ghosh K.S. Mallikarjuna Rao 《Operations Research Letters》2007,35(6):799-804

We study a zero-sum stochastic game where each player uses both control and stopping times. Under certain conditions we establish the existence of a saddle point equilibrium, and show that the value function of the game is the unique solution of certain dynamic programming inequalities with bilateral constraints. 相似文献

9.

An algorithm for solving a two-person game with information transfer

A. R. Mamatov 《Computational Mathematics and Mathematical Physics》2006,46(10):1699-1704

An algorithm for solving a two-person game with information transfer is proposed. The algorithm is based on a special linear programming problem. An example is given. 相似文献

10.

Markov games with incomplete information

Alexander Krausz Ulrich Rieder 《Mathematical Methods of Operations Research》1997,46(2):263-279

We consider zero-sum Markov games with incomplete information. Here, the second player is never informed about the current state of the underlying Markov chain. The existence of a value and of optimal strategies for both players is shown. In particular, we present finite algorithms for computing optimal strategies for the informed and uninformed player. The algorithms are based on linear programming results. 相似文献

11.

Robust optimization for interactive multiobjective programming with imprecise information applied to R&D project portfolio selection

Farhad Hassanzadeh Hamid Nemati Minghe Sun 《European Journal of Operational Research》2014

A multiobjective binary integer programming model for R&D project portfolio selection with competing objectives is developed when problem coefficients in both objective functions and constraints are uncertain. Robust optimization is used in dealing with uncertainty while an interactive procedure is used in making tradeoffs among the multiple objectives. Robust nondominated solutions are generated by solving the linearized counterpart of the robust augmented weighted Tchebycheff programs. A decision maker’s most preferred solution is identified in the interactive robust weighted Tchebycheff procedure by progressively eliciting and incorporating the decision maker’s preference information into the solution process. An example is presented to illustrate the solution approach and performance. The developed approach can also be applied to general multiobjective mixed integer programming problems. 相似文献

12.

On nash equilibrium solutions in nonzero-sum stochastic games with complete information

Heinz-Uwe Küenle 《International Journal of Game Theory》1994,23(4):303-324

Two-person nonzero-sum stochastic games with complete information are considered. It is shown that it is sufficient to search the equilibrium solutions in a class of deterministic strategy pairs — the so-calledintimidation strategy pairs. Furthermore, properties of the set of all equilibrium losses of such strategy pairs are proved. 相似文献

13.

On discounted stochastic games with incomplete information on payoffs and a security application

Erim Kardeş 《Operations Research Letters》2014

This paper presents a robust optimization model for n

n

-person finite state/action stochastic games with incomplete information on payoffs. For polytopic uncertainty sets, we propose an explicit mathematical programming formulation for an equilibrium calculation. It turns out that a global optimal of this mathematical program yields an equilibrium point and epsilon-equilibria can be calculated based on this result. We briefly describe an incomplete information version of a security application that can benefit from robust game theory. 相似文献

14.

Existence of optimal strategies in Markov games with incomplete information 总被引：1，自引：0，他引：1

Abraham Neyman 《International Journal of Game Theory》2008,37(4):581-596

The existence of a value and optimal strategies is proved for the class of two-person repeated games where the state follows a Markov chain independently of players’ actions and at the beginning of each stage only Player 1 is informed about the state. The results apply to the case of standard signaling where players’ stage actions are observable, as well as to the model with general signals provided that Player 1 has a nonrevealing repeated game strategy. The proofs reduce the analysis of these repeated games to that of classical repeated games with incomplete information on one side. This research was supported in part by Israeli Science Foundation grants 382/98, 263/03, and 1123/06, and by the Zvi Hermann Shapira Research Fund. 相似文献

15.

state estimation of discrete-time markov jump neural networks with general transition probabilities and output quantization

R. Sasirekha R. Rakkiyappan Ying Wan Ahmed Alsaedi 《Journal of Difference Equations and Applications》2017,23(11):1824-1852

相似文献

16.

Nonzero-sum games for continuous-time Markov chains with unbounded transition and average payoff rates

WenZhao Zhang XianPing Guo 《中国科学数学(英文版)》2012,55(11):2405-2416

This paper attempts to study two-person nonzero-sum games for denumerable continuous-time Markov chains determined by transition rates,with an expected average criterion.The transition rates are allowed to be unbounded,and the payoff functions may be unbounded from above and from below.We give suitable conditions under which the existence of a Nash equilibrium is ensured.More precisely,using the socalled "vanishing discount" approach,a Nash equilibrium for the average criterion is obtained as a limit point of a sequence of equilibrium strategies for the discounted criterion as the discount factors tend to zero.Our results are illustrated with a birth-and-death game. 相似文献

17.

Real-time order tracking for supply systems with multiple transportation stages

Nana Bryan Mandyam M. Srinivasan 《European Journal of Operational Research》2014

This paper presents a stochastic model that evaluates the value of real-time shipment tracking information for supply systems that consist of a retailer, a manufacturer, and multiple stages of transportation. The retailer aggregates demand for a single product from end customers and places orders on the manufacturer. Orders received by the manufacturer may take several time periods before they are fulfilled. Shipments dispatched by the manufacturer move through multiple stages before they reach the retailer, where each stage represents a physical location or a step in the replenishment process. The lead time for a new order depends on the number of unshipped orders at the manufacturer’s site and the number and location of all shipments in transportation. The analytic model uses real-time information on the number of orders unfulfilled at the manufacturer’s site, as well as the location of shipments to the retailer, to determine the ordering policy that minimizes the long-run average cost for the retailer. It is shown that the long-run average cost is lower with real-time tracking information, and that the cost savings are substantial for a number of situations. The model also provides some guidelines for operating this supply system under various scenarios. Numerical examples demonstrate that when there is a lack of information it is better for the retailer to order every time period, but with full information on the status in the supply system it is not always necessary for the retailer to order every time period to lower the long-run average cost. 相似文献

18.

Memory feedback controller design for stochastic Markov jump distributed delay systems with input saturation and partially known transition rates

《Nonlinear Analysis: Hybrid Systems》2015

This paper considers the problem of stabilization for a class of stochastic Markov jump distributed delay systems with partially known transition rates subject to saturating actuators. By employing local sector conditions and an appropriate Lyapunov function, a state memory feedback controller is designed to guarantee that the resulted closed-loop constrained systems are mean-square stochastic asymptotically stable. Some sufficient conditions for the solution to this problem are derived in terms of linear matrix inequalities. Finally, a numerical example is provided to demonstrate the effectiveness of the proposed method. 相似文献

19.

Hamilton Jacobi Isaacs equations for differential games with asymmetric information on probabilistic initial condition

C. Jimenez M. Quincampoix 《Journal of Mathematical Analysis and Applications》2018,457(2):1422-1451

We investigate Hamilton Jacobi Isaacs equations associated to a two-players zero-sum differential game with incomplete information. The first player has complete information on the initial state of the game while the second player has only information of a – possibly uncountable – probabilistic nature: he knows a probability measure on the initial state. Such differential games with finite type incomplete information can be viewed as a generalization of the famous Aumann–Maschler theory for repeated games. The main goal and novelty of the present work consists in obtaining and investigating a Hamilton Jacobi Isaacs Equation satisfied by the upper and the lower values of the game. Since we obtain a uniqueness result for such Hamilton Jacobi equation, as a byproduct, this gives an alternative proof of the existence of a value of the differential game (which has been already obtained in the literature by different technics). Since the Hamilton Jacobi equation is naturally stated in the space of probability measures, we use the Wasserstein distance and some tools of optimal transport theory. 相似文献