期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Constrained markov decision processes with compact state and action spaces: the average case

《Optimization》2012,61(2):255-269

Constrained Markov decision processes with compact state and action spaces are studied under long-run average reward or cost criteria. By introducing a corresponding Lagrange function, a saddle-point theorem is given, by which the existence of a constrained optimal pair of initial state distribution and policy is shown. Also, under the hypothesis of Doeblin, the functional characterization of a constrained optimal policy is obtained 相似文献

2.

Optimality equations and sensitive optimality in bounded Markov decision processes 1

《Optimization》2012,61(5):767-781

This paper consider Markov decision processes with countable state space, compact action spaces and a bounded reward function. Under some recurrence and connectedness condition, including the simultaneous Döblin condition, we prove the existence of bounded solutions of the optimality equations which arise for the multichain case in connection with the average reward criterion and sensitive optimality criteria, and we give a characterization of the sets of n-average optimal decision rules. 相似文献

3.

Denumerable state stochastic games with limiting average payoff

V. S. Borkar M. K. Ghosh 《Journal of Optimization Theory and Applications》1993,76(3):539-560

We study stochastic games with countable state space, compact action spaces, and limiting average payoff. ForN-person games, the existence of an equilibrium in stationary strategies is established under a certain Liapunov stability condition. For two-person zero-sum games, the existence of a value and optimal strategies for both players are established under the same stability condition.The authors wish to thank Prof. T. Parthasarathy for pointing out an error in an earlier version of this paper. M. K. Ghosh wishes to thank Prof. A. Arapostathis and Prof. S. I. Marcus for their hospitality and support. 相似文献

4.

An operator solution of stochastic games

A. Maitra W. Sudderth 《Israel Journal of Mathematics》1992,78(1):33-49

A class of zero-sum, two-person stochastic games is shown to have a value which can be calculated by transfinite iteration of an operator. The games considered have a countable state space, finite action spaces for each player, and a payoff sufficiently general to include classical stochastic games as well as Blackwell’s infiniteG _δ games of imperfect information. Research supported by National Science Foundation Grants DMS-8801085 and DMS-8911548. 相似文献

5.

Stochastic approximations of constrained discounted Markov decision processes

François Dufour Tomás Prieto-Rumeau 《Journal of Mathematical Analysis and Applications》2014

We consider a discrete-time constrained Markov decision process under the discounted cost optimality criterion. The state and action spaces are assumed to be Borel spaces, while the cost and constraint functions might be unbounded. We are interested in approximating numerically the optimal discounted constrained cost. To this end, we suppose that the transition kernel of the Markov decision process is absolutely continuous with respect to some probability measure μ . Then, by solving the linear programming formulation of a constrained control problem related to the empirical probability measure _μ_n

μ_{n}

of μ, we obtain the corresponding approximation of the optimal constrained cost. We derive a concentration inequality which gives bounds on the probability that the estimation error is larger than some given constant. This bound is shown to decrease exponentially in n. Our theoretical results are illustrated with a numerical application based on a stochastic version of the Beverton–Holt population model. 相似文献

6.

Continuous-time constrained stochastic games with average criteria

Wenzhao Zhang Binfu Wang Dewang Chen 《Operations Research Letters》2018,46(1):109-115

In this paper, we consider the continuous-time nonzero-sum stochastic games under the constrained average criteria. The state space is denumerable and the action space of each player is a general Polish space. The transition rates, reward and cost functions are allowed to be unbounded. The main hypotheses in this paper include the standard drift conditions, continuity-compactness condition and some ergodicity assumptions. By applying the vanishing discount method, we obtain the existence of stationary constrained average Nash equilibria. 相似文献

7.

Stochastic games with metric state space

H. A. M. Couwenbergh 《International Journal of Game Theory》1980,9(1):25-36

In this paper the stochastic two-person zero-sum game of Shapley is considered, with metric state space and compact action spaces. It is proved that both players have stationary optimal strategies, under conditions which are weaker than those ofMaitra/Parthasarathy (a.o. no compactness of the state space). This is done in the following way: we show the existence of optimal strategies first for the one-period game with general terminal reward, then for then-period games (n=1,2,...); further we prove that the game over the infinite horizon has a valuev, which is the limit of then-period game values. Finally the stationary optimal strategies are found as optimal strategies in the one-period game with terminal rewardv. 相似文献

8.

Nonzero-sum constrained discrete-time Markov games: the case of unbounded costs

Wenzhao Zhang Yonghui Huang Xianping Guo 《TOP》2014,22(3):1074-1102

In this paper, we consider discrete-time \(N\) -person constrained stochastic games with discounted cost criteria. The state space is denumerable and the action space is a Borel set, while the cost functions are admitted to be unbounded from below and above. Under suitable conditions weaker than those in (Alvarez-Mena and Hernández-Lerma, Math Methods Oper Res 63:261–285, 2006) for bounded cost functions, we also show the existence of a Nash equilibrium for the constrained games by introducing two approximations. The first one, which is as in (Alvarez-Mena and Hernández-Lerma, Math Methods Oper Res 63:261–285, 2006), is to construct a sequence of finite games to approximate a (constrained) auxiliary game with an initial distribution that is concentrated on a finite set. However, without hypotheses of bounded costs as in (Alvarez-Mena and Hernández-Lerma, Math Methods Oper Res 63:261–285, 2006), we also establish the existence of a Nash equilibrium for the auxiliary game with unbounded costs by developing more shaper error bounds of the approximation. The second one, which is new, is to construct a sequence of the auxiliary-type games above and prove that the limit of the sequence of Nash equilibria for the auxiliary-type games is a Nash equilibrium for the original constrained games. Our results are illustrated by a controlled queueing system. 相似文献

9.

Average sample-path optimality for continuous-time Markov decision processes in Polish spaces

Quan-xin Zhu 《应用数学学报(英文版)》2011,27(4):613-624

In this paper we study the average sample-path cost(ASPC) problem for continuous-time Markov decision processes in Polish spaces.To the best of our knowledge,this paper is a first attempt to study the ASPC criterion on continuous-time MDPs with Polish state and action spaces.The corresponding transition rates are allowed to be unbounded,and the cost rates may have neither upper nor lower bounds.Under some mild hypotheses,we prove the existence of ε(ε≥ 0)-ASPC optimal stationary policies based on two differe... 相似文献

10.

On the existence of almost-pure-strategy Nash equilibrium in <Emphasis Type="Italic">n</Emphasis>-person finite games

Wojciech Połowczuk Piotr Więcek Tadeusz Radzik 《Mathematical Methods of Operations Research》2007,65(1):141-152

This paper gives wide characterization of n-person non-coalitional games with finite players’ strategy spaces and payoff functions having some concavity or convexity properties. The characterization is done in terms of the existence of two-point-strategy Nash equilibria, that is equilibria consisting only of mixed strategies with supports being one or two-point sets of players’ pure strategy spaces. The structure of such simple equilibria is discussed in different cases. The results obtained in the paper can be seen as a discrete counterpart of Glicksberg’s theorem and other known results about the existence of pure (or “almost pure”) Nash equilibria in continuous concave (convex) games with compact convex spaces of players’ pure strategies. 相似文献

11.

Structural stability and robustness to bounded rationality for non-compact cases 总被引：2，自引：0，他引：2

Jian Yu Hui Yang Chao Yu 《Journal of Global Optimization》2009,44(1):149-157

We study the model M consisting of “general games” with noncompact action space, together with an associated abstract rationality function. We prove that M is structurally stable and robust to ϵ-equilibria for “almost all” parameters. As applications, we investigate structural stability and robustness to bounded rationality for noncooperative games, multiobjective optimizations and fixed point problems satisfying existence and some continuity conditions. Specifically, we introduce concrete rationality functions for such three kinds of problems with both payoffs and strategy sets, objective functions and domain spaces, and correspondence and domain spaces as parameters, respectively, and show the generic structural stability and robustness to bounded rationality for the corresponding model Ms. 相似文献

12.

Nonzero-sum non-stationary discounted Markov game model 总被引：1，自引：0，他引：1

Rangcheng Jia Yuanyao Ding 《Mathematical Methods of Operations Research》2000,52(2):265-270

The goal of this paper is provide a theory of K-person non-stationary Markov games with unbounded rewards, for a countable state space and action spaces. We investigate both the finite and infinite horizon problems. We define the concept of strong Nash equilibrium and present conditions for both problems for which strong Nash or Nash equilibrium strategies exist for all players within the Markov strategies, and show that the rewards in equilibrium satisfy the optimality equations. 相似文献

13.

A more refined convexity idea for Nash equilibria

Francesco Ciardiello 《Journal of Fixed Point Theory and Applications》2010,8(2):365-393

In this paper, we relax the classical quasi-concavity assumption for the existence of pure Nash equilibria in the setting of constrained and unconstrained games in normal form. Multiconnected convexity (H. Ben-El-Mechaiekh et al., 1998) in spaces without any linear structure is a keen point. We present two games in which we show how the generalized continuity and quasi-concavity hypotheses are unrelated to each other as sufficient conditions for existence of Nash equilibria for games in normal form. Then our results are applied to two non-zero-sum games lacking the classical quasi-concavity assumption (Nash, 1950) and the more recent improvements (Ziad, 1999) and (Abalo and Kostreva, 2004). As minor results, we introduce new concept of convexity, named a-convexity, and some counterexamples of the relationships between some continuity conditions on players’ payoffs imposed by Lignola (1997), Reny (1999) and Simon (1987). 相似文献

14.

Stochastic games for distributed players on graphs

Ruslan K.?Chornei Hans?Daduna Email author Pavel S.?Knopov 《Mathematical Methods of Operations Research》2004,60(2):279-298

相似文献

15.

An operator approach to zero-sum repeated games

Dinah Rosenberg Sylvain Sorin 《Israel Journal of Mathematics》2001,121(1):221-246

We consider two person zero-sum stochastic games. The recursive formula for the valuesvλ (resp.v _n) of the discounted (resp. finitely repeated) version can be written in terms of a single basic operator Φ(α,f) where α is the weight on the present payoff andf the future payoff. We give sufficient conditions in terms of Φ(α,f) and its derivative at 0 for limv _n and limvλ to exist and to be equal. We apply these results to obtain such convergence properties for absorbing games with compact action spaces and incomplete information games. 相似文献

16.

Stochastic games on a product state space: the periodic case

János Flesch Gijs Schoenmakers Koos Vrieze 《International Journal of Game Theory》2009,38(2):263-289

We examine so-called product-games. These are n-player stochastic games played on a product state space S ¹ × ... × S ⁿ, in which player i controls the transitions on S ⁱ. For the general n-player case, we establish the existence of 0-equilibria. In addition, for the case of two-player zero-sum games of this type, we show that both players have stationary 0-optimal strategies. In the analysis of product-games, interestingly, a central role is played by the periodic features of the transition structure. Flesch et al. (Math Oper Res 33, 403–420, 2008) showed the existence of 0-equilibria under the assumption that, for every player i, the transition structure on S ⁱ is aperiodic. In this article, we examine product-games with periodic transition structures. Even though a large part of the approach in Flesch et al. (Math Oper Res 33, 403–420, 2008) remains applicable, we encounter a number of tricky problems that we have to address. We provide illustrative examples to clarify the essence of the difference between the aperiodic and periodic cases. 相似文献

17.

Enumerating Constrained Non-crossing Minimally Rigid Frameworks 总被引：2，自引：1，他引：1

David Avis Naoki Katoh Makoto Ohsaki Ileana Streinu Shin-ichi Tanigawa 《Discrete and Computational Geometry》2008,40(1):31-46

In this paper we present an algorithm for enumerating without repetitions all the non-crossing generically minimally rigid bar-and-joint frameworks under edge constraints, which we call constrained non-crossing Laman frameworks, on a given set of n points in the plane. Our algorithm is based on the reverse search paradigm of Avis and Fukuda. It generates each output graph in O(n ⁴) time and O(n) space, or, with a slightly different implementation, in O(n ³) time and O(n ²) space. In particular, we obtain that the set of all the constrained non-crossing Laman frameworks on a given point set is connected by flips which preserve the Laman property. D. Avis’s research was supported by NSERC and FQRNT grants. N. Katoh’s, M. Ohsaki’s and S.-i. Tanigawa’s research was supported by NEXT Grant-in-Aid for Scientific Research on priority areas of New Horizons in Computing. I. Streinu’s research was supported by NSF grant CCF-0430990 and NSF-DARPA CARGO CCR-0310661. 相似文献

18.

Average optimality inequality for continuous-time Markov decision processes in Polish spaces

Quanxin Zhu 《Mathematical Methods of Operations Research》2007,66(2):299-313

In this paper, we study the average optimality for continuous-time controlled jump Markov processes in general state and action spaces. The criterion to be minimized is the average expected costs. Both the transition rates and the cost rates are allowed to be unbounded. We propose another set of conditions under which we first establish one average optimality inequality by using the well-known “vanishing discounting factor approach”. Then, when the cost (or reward) rates are nonnegative (or nonpositive), from the average optimality inequality we prove the existence of an average optimal stationary policy in all randomized history dependent policies by using the Dynkin formula and the Tauberian theorem. Finally, when the cost (or reward) rates have neither upper nor lower bounds, we also prove the existence of an average optimal policy in all (deterministic) stationary policies by constructing a “new” cost (or reward) rate. Research partially supported by the Natural Science Foundation of China (Grant No: 10626021) and the Natural Science Foundation of Guangdong Province (Grant No: 06300957). 相似文献

19.

Pareto equilibria for generalized constrained multiobjective games in FC-spaces without local convexity structure

Xie Ping Ding 《Nonlinear Analysis: Theory, Methods & Applications》2009,71(11):5229-5237

In this paper, we further study a class of generalized constrained multiobjective games where the number of players may be finite or infinite, the strategy sets may be general FC-spaces without local convexity structure, and all payoff functions get their values in infinite-dimensional topological vector spaces. By using an existence theorem of maximal elements for a family of set-valued mappings in FC-spaces due to the author, an existence theorem of solutions for a system of generalized vector quasivariational inclusions is first proved in general FC-spaces. By applying the existence result of solutions of the system of generalized vector quasivariational inclusions, some existence theorems of (weak) Pareto equilibria for the generalized constrained multiobjective games are established in noncompact product FC-spaces. Some special cases of our results are also discussed. Our results are new and different from the corresponding known results in the literature. 相似文献

20.

Probabilities of Pure Nash Equilibria in Matrix Games when the Payoff Entries of One Player Are Randomly Selected

J. M. Peterson M. A. Simaan 《Journal of Optimization Theory and Applications》2008,137(2):401-410

The Nash equilibrium in pure strategies represents an important solution concept in nonzero sum matrix games. Existence of Nash equilibria in games with known and with randomly selected payoff entries have been studied extensively. In many real games, however, a player may know his own payoff entries but not the payoff entries of the other player. In this paper, we consider nonzero sum matrix games where the payoff entries of one player are known, but the payoff entries of the other player are assumed to be randomly selected. We are interested in determining the probabilities of existence of pure Nash equilibria in such games. We characterize these probabilities by first determining the finite space of ordinal matrix games that corresponds to the infinite space of matrix games with random entries for only one player. We then partition this space into mutually exclusive spaces that correspond to games with no Nash equilibria and with r Nash equilibria. In order to effectively compute the sizes of these spaces, we introduce the concept of top-rated preferences minimal ordinal games. We then present a theorem which provides a mechanism for computing the number of games in each of these mutually exclusive spaces, which then can be used to determine the probabilities. Finally, we summarize the results by deriving the probabilities of existence of unique, nonunique, and no Nash equilibria, and we present an illustrative example. 相似文献