Similar literature
 Found 20 similar documents; search took 15 ms
1.
Competing populations of finite automata co-evolve in an evolutionary algorithm to play two-player games. Populations endowed with greater complexity do better against their less complex opponents in a strictly competitive constant-sum game. In contrast, complexity determines efficiency levels, but not relative earnings, in a Prisoner's Dilemma game; greater levels of complexity result in mutually higher earnings. With reporting noise, advantages to complexity are lost and efficiency levels are reduced as relatively less complex strategies are selected. © 2004 Wiley Periodicals, Inc. Complexity 9: 71–78, 2004

2.
Earlier theoretical accounts of collective learning relied on rules and operating procedures as the organizational memory (March in Organ. Sci. 2(1):71–87, 1991; Rodan in Scand. J. Manag. 21:407–428, 2005). This paper builds on this tradition drawing on ideas from social network theory. Learning is modeled as a social-psychological process (Darr and Kurtzberg in Organ. Behav. Hum. Decis. Process. 82(1):28–44, 2000; Rulke et al. in Organ. Behav. Hum. Decis. Process. 82(1):134–149, 2000), in which organizations learn by exchanging information internally between their members (Argote et al. in Organ. Behav. Hum. Decis. Process. 82(1):1–8, 2000; Carley in Am. Soc. Rev. 56(3):331–354, 1991; Carley in Soc. Perspect. 48(4):547–571, 1995). Learning is also characterized as stochastic and creative (Gruenfeld et al. in Organ. Behav. Hum. Decis. Process. 82(1):45–59, 2000). This model is used to explore predictions about the effect social networks have on idea generation and learning and alternative strategies for choosing from whom to seek information.
Simon Rodan

3.
The CPR (“cumulative proportional reinforcement”) learning rule stipulates that an agent chooses a move with a probability proportional to the cumulative payoff she obtained in the past with that move. Previously considered for strategies in normal form games (Laslier, Topol and Walliser, Games and Econ. Behav., 2001), the CPR rule is here adapted for actions in perfect information extensive form games. The paper shows that the action-based CPR process converges with probability one to the (unique) subgame perfect equilibrium. Received: October 2004
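The selection rule described in this abstract can be illustrated with a minimal sketch. It covers only a single decision with fixed positive payoffs, not the extensive-form adaptation studied in the paper. The function names, the initial weight of 1.0 per move, and the example payoffs are assumptions made for illustration.

```python
import random

def cpr_choose(cumulative, rng):
    """Pick a move with probability proportional to its cumulative payoff."""
    moves = list(cumulative)
    weights = [cumulative[m] for m in moves]
    return rng.choices(moves, weights=weights, k=1)[0]

def cpr_update(cumulative, move, payoff):
    """Add the payoff just obtained to the chosen move's running total."""
    cumulative[move] += payoff

def simulate(payoffs, rounds=5000, seed=0):
    """Run the rule on one decision whose payoffs are fixed and positive."""
    rng = random.Random(seed)
    # Assumption: every move starts with weight 1.0 so it can be drawn at all.
    cumulative = {move: 1.0 for move in payoffs}
    for _ in range(rounds):
        move = cpr_choose(cumulative, rng)
        cpr_update(cumulative, move, payoffs[move])
    total = sum(cumulative.values())
    return {move: weight / total for move, weight in cumulative.items()}

if __name__ == "__main__":
    # Hypothetical payoffs: move "b" pays twice as much as move "a".
    print(simulate({"a": 1.0, "b": 2.0}))
```

Because each payoff feeds back into the probability of choosing the same move again, moves that pay more tend to be reinforced faster; the sketch makes no claim about the convergence result proved in the paper.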

4.
Decision makers in dynamic environments such as air traffic control, firefighting, and call center operations adapt in real-time using outcome feedback. Understanding this adaptation is important for influencing and improving the decisions made. Recently, stimulus-response (S-R) learning models have been proposed as explanations for decision makers' adaptation. S-R models hypothesize that decision makers choose an action option based on their anticipation of its success. Decision makers learn by accumulating evidence over action options and combining that evidence with prior expectations. This study examines a standard S-R model and a simple variation of this model, in which past experience may receive an extremely low weight, as explanations for decision makers' adaptation in an evolving Internet-based bargaining environment. In Experiment 1, decision makers are taught to predict behavior in a bargaining task that follows rules that may be the opposite of, congruent to, or unrelated to a second task in which they must choose the deal terms they will offer. Both models provide a good account of the prediction task. However, only the second model, in which decision makers heavily discount all but the most recent past experience, provides a good account of subsequent behavior in the second task. To test whether Experiment 1 artificially related choice behavior and prediction, a second experiment examines both models' predictions concerning the effects of bargaining experience on subsequent prediction. In this study, decision models where long-term experience plays a dominating role do not appear to provide adequate explanations of decision makers' adaptation to their opponent's changing response behavior.

5.
This paper deals with modelling and hierarchical learning control of an industrial phosphate dryer. The model is derived from heat and mass balances. It consists of hyperbolic nonlinear partial differential equations. The control and the output variables appear at the boundary conditions. The method of characteristics that is based on two independent variables is used for numerical simulation purposes. The control objectives are to minimize fuel consumption and to keep the moisture content of dried phosphate close to a desired value, despite external perturbations acting on the drying process. The fuel flow and the dried product moisture content have been selected as control variables. A hierarchical learning system, operating in a random environment that corresponds to the dryer, is used for control purposes. During its operation the learning system collects the pertinent information about the variables that describe the process to be controlled and generates a control action. The obtained results show the good performance characteristics of the considered controller.

6.
This paper presents experimental results from an analysis of two similar games, the repeated ultimatum game and the repeated best-shot game. The experiment examines whether the amount and content of information given to players affects the evolution of play in the two games. In one experimental treatment, subjects in both games observe not only their own actions and payoffs, but also those of one randomly chosen pair of players in the just-completed round of play. In the other treatment, subjects in both games observe only their own actions and payoffs. We present evidence suggesting that observation of other players' actions and payoffs may affect the evolution of play relative to the case of no observation. Received February 1996 / Final version April 1998

7.
We analyze the learning behavior of a Simple Genetic Algorithm (SGA) in symmetric 3 × 3 strategic-form games. For contests within one population and between two populations, the behavior of the SGA is compared with the behavior of the replicator dynamics and is analyzed with respect to equilibrium concepts in evolutionary game theory. Furthermore, conservative non-adaptive strings are added to the population, which leads to convergence to an equilibrium even in “GA-deceptive” games where the equilibrium cannot be reached by GAs using only selection and crossover.
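For readers unfamiliar with the replicator dynamics used as the benchmark in this abstract, a minimal single-population sketch follows. It uses a simple Euler discretization; the step size, the number of steps, and the example 3 × 3 payoff matrix are assumptions chosen for illustration and are not taken from the paper.

```python
def replicator_step(shares, payoff, dt=0.01):
    """One Euler step of the single-population replicator dynamics."""
    n = len(shares)
    # Expected payoff of each pure strategy against the current population mix.
    fitness = [sum(payoff[i][j] * shares[j] for j in range(n)) for i in range(n)]
    # Population-average payoff.
    average = sum(shares[i] * fitness[i] for i in range(n))
    # A strategy's share grows in proportion to its payoff advantage.
    return [shares[i] + dt * shares[i] * (fitness[i] - average) for i in range(n)]

def run_replicator(shares, payoff, steps=2000, dt=0.01):
    """Iterate the Euler step a fixed number of times."""
    for _ in range(steps):
        shares = replicator_step(shares, payoff, dt)
    return shares

if __name__ == "__main__":
    # Hypothetical symmetric 3 x 3 game in which strategy 0 strictly dominates.
    payoff = [[3, 3, 3],
              [2, 2, 2],
              [1, 1, 1]]
    print(run_replicator([1 / 3, 1 / 3, 1 / 3], payoff))
```

In this example the dominant strategy takes over the population; games with interior or cyclic behavior would need a smaller step size or a proper ODE solver.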

8.
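Two Game-Theoretic Analyses of Restraining Merchants from False Advertising   Total citations: 5 (self-citations: 0, citations by others: 5)
Advertising is an important channel through which consumers learn about the types, prices, and performance of goods, and it is also an important marketing tool for merchants. At present, however, some merchants run false advertisements in pursuit of excess profits. Starting from the relationships between government regulators and merchants and between consumers and merchants, this paper gives a game-theoretic analysis of how to restrain merchants from false advertising, and further discusses cases such as the multi-stage (infinitely repeated) game between consumers and merchants.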

9.
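A large body of experimental economics research has confirmed the influence of fairness concerns and learning effects on decision makers' behavior. This paper studies a three-member supply chain system. By designing separate experimental environments for individual self-learning and for social learning, it compares the degree of the backup supplier's fairness concern and the characteristics of the learning curves of the manufacturer and the backup supplier. The experimental results support the hypothesis that a learning effect exists: as the number of experimental periods increases, the decision time per period gradually decreases, the backup supplier's overall rejection rate gradually falls, and the manufacturer's strategies gradually concentrate. A reinforcement learning model incorporating fairness concerns is then constructed. Parameter estimation shows that the backup supplier's horizontal fairness concern is significant in both the individual self-learning and the social learning environments, and that information sharing has no obvious effect on the backup supplier's horizontal fairness-concern preference.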

10.
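The Core of Repeated n-Person Stochastic Cooperative Games   Total citations: 1 (self-citations: 0, citations by others: 1)
Building on the model of stochastic cooperative games introduced by Suijs et al. (1995), this paper develops a theory of repeated n-person stochastic cooperative games. It defines payoff sequences for such games and a domination relation between payoff sequences, uses these to define the core, superadditivity, and convexity of repeated n-person stochastic cooperative games, and discusses some characteristics and properties of this core.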

11.
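A repeated game model under asymmetric information is established to describe the game between a dominant trader (zhuangjia) and retail investors in the stock market. It is derived that the discounted stock price process follows a martingale driven by Brownian motion, and an endogenous explanation for random stock price movements is given: during the game, the dominant trader adopts a randomized strategy to confuse the opponent in order to conceal information unknown to the retail investors, which leads to random movements in the stock price. On this basis, the corresponding option pricing problem is studied and an option pricing formula is given.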

12.
Order Acceptance (OA) is one of the main functions in business control. Accepting an order when capacity is available could prevent the system from accepting more profitable orders in the future, with opportunity losses as a consequence. Uncertain information is also an important issue here. We use Markov decision models and learning methods from Artificial Intelligence to find decision policies under uncertainty. Reinforcement Learning (RL) is quite a new approach in OA. It is shown here that RL works well compared with heuristics. It is demonstrated that employing an RL-trained agent is a robust, flexible approach that in addition can be used to support the detection of good heuristics.

13.
The convergence properties for reinforcement learning approaches, such as temporal differences and Q-learning, have been established under moderate assumptions for discrete state and action spaces. In practice, however, many systems have either continuous action spaces or a large number of discrete elements. This paper presents an approximate dynamic programming approach to reinforcement learning for continuous action set-point regulator problems, which learns near-optimal control policies based on scalar performance measures. The continuous-action space (CAS) algorithm uses derivative-free line search methods to obtain the optimal action in the continuous space. The theoretical convergence properties of the algorithm are presented. Several heuristic stopping criteria are investigated and practical application is illustrated by two example problems: the inverted pendulum balancing problem and the power system stabilization problem.
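As background for the discrete case this abstract refers to, the standard tabular Q-learning update can be sketched as follows. This is the textbook discrete-state, discrete-action rule, not the paper's CAS algorithm; the state and action names and the values of `alpha` and `gamma` are assumptions for illustration.

```python
def make_table(states, actions):
    """Create a Q-table with every state-action value initialised to zero."""
    return {s: {a: 0.0 for a in actions} for s in states}

def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Apply one tabular Q-learning update and return the new value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]

if __name__ == "__main__":
    # Hypothetical two-state, two-action problem.
    q = make_table(["s0", "s1"], ["left", "right"])
    print(q_update(q, "s0", "right", reward=1.0, next_state="s1"))
```

The `max` over next-state actions is the step that becomes difficult when the action space is continuous, which is the setting the abstract addresses with line search.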

14.
Digital games (e.g., video games or computer games) have been reported as an effective educational method that can improve students' motivation and performance in mathematics education. This meta-analysis study (a) investigates the current trend of digital game-based learning (DGBL) by reviewing the research studies on the use of DGBL for mathematics learning, (b) examines the overall effect size of DGBL on K-12 students' achievement in mathematics learning, and (c) discusses future directions for DGBL research in the context of mathematics learning. In total, 296 studies were collected for the review, but of those studies, only 33 research studies were identified as empirical studies and systematically analyzed to investigate the current research trends. In addition, due to insufficient statistical data, only 17 out of the 33 studies were analyzed to calculate the overall effect size of digital games on mathematics education. This study will contribute to the research community by analyzing recent trends in significant DGBL research, especially for those who are interested in using DGBL for mathematics education.

15.
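The concept of τ-domination is defined for repeated n-person stochastic cooperative games and used to refine the core of such games, that is, to define the τ-core of repeated n-person stochastic cooperative games. Finally, the relationship between the τ-core and the core is given, together with the properties and characteristics that the τ-core satisfies.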

16.
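Based on a stochastic demand function, this paper presents an equilibrium analysis of infinitely repeated and indefinitely repeated games in which two competing firms practice third-degree price discrimination. For the infinitely repeated game, the subgame perfect Nash equilibrium of the two firms under the influence of the discount factor is obtained. For the indefinitely repeated game, different schemes are designed and simulated in Matlab; the firms' payoffs under the different simulation schemes are compared across discount factors, the simulation results are analyzed systematically, and an equilibrium analysis in the statistical sense is obtained.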

17.
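A Study of the Degree of Product Substitutability and Collusive Cooperation between Firms   Total citations: 1 (self-citations: 0, citations by others: 1)
By building a mathematical model of a repeated game, this paper studies quantitatively the relationship between the degree of product substitutability and long-term collusive cooperation between firms. The results show that, under linear demand, long-term collusion becomes relatively more difficult as the degree of product substitutability increases, but overall the degree of product substitutability has very little effect on whether firms collude in the long run. The results can provide some theoretical support for the antitrust legislation that the government is currently preparing.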

18.
19.
This paper deals with a duel with time lag that has the following structure: Each of two players, I and II, has a gun with one bullet and can fire it at any time in [0, 1], aiming at his opponent. The gun of player I is silent and the gun of player II is noisy with time lag t (i.e., if player II fires at time x, then player I knows it at time x+t). They both have equal accuracy functions. Furthermore, if player I hits player II without being hit himself before, the payoff is +1; if player I is hit by player II without hitting player II before, the payoff is –1; if they hit each other at the same time or both survive, the payoff is 0. This paper gives the optimal strategy for each player, the game value, and some examples.

20.
We present a complete solution to a card game with historical origins. Our analysis exploits the convexity properties in the payoff matrix, allowing this discrete game to be resolved by continuous methods.
