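Evolution of zero-determinant strategy in iterated snowdrift game
Cite this article: Wang Jun-Fang, Guo Jin-Li, Liu Han, Shen Ai-Zhong. Evolution of zero-determinant strategy in iterated snowdrift game[J]. Acta Physica Sinica, 2017, 66(18): 180203-180203. DOI: 10.7498/aps.66.180203
Authors: Wang Jun-Fang  Guo Jin-Li  Liu Han  Shen Ai-Zhong
Affiliation: 1. Business School, University of Shanghai for Science and Technology, Shanghai 200093, China; 2. School of Mathematics and Statistics, North China University of Water Resources and Electric Power, Zhengzhou 450046, China; 3. Trade and Technology Department, Xijing University, Xi'an 710123, China
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 71571119) and the Young Scientists Fund of the National Natural Science Foundation of China (Grant No. 11501199).
Abstract: A zero-determinant strategy can not only set the opponent's payoff unilaterally but also enforce a linear relationship between the two players' payoffs, thereby extorting the opponent. Because the payoffs in the early rounds of a game played with a zero-determinant strategy deviate from those in the steady state, this paper uses Markov chain theory to derive the transient distribution, the transient income, and the time required to reach the steady state for the game between a zero-determinant strategy and a full-cooperation strategy. With a small extortion factor, the extortioner's early payoff is higher than the steady-state payoff; with a larger extortion factor, the opposite holds. Moreover, the larger the extortion factor, the less favorable it is to cooperation between the two players and the slower the approach to the steady state. This provides a method for calculating real-time payoffs in real-life games in which strategies are updated frequently. In addition, for the game between an extortion strategy and an evolutionary player, it is proven that when both players defect, the evolutionary player necessarily evolves into the full-cooperation strategy in the next round. Simulation of the strategy update process from all states shows that the evolutionary player's speed of evolution differs significantly across the four cases and that the player eventually evolves into the full-cooperation strategy, indicating that the zero-determinant strategy is a catalyst for the emergence of cooperation.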

Keywords: zero-determinant strategy  snowdrift game  stationary distribution  transient income
Received: 2017-03-17

Evolution of zero-determinant strategy in iterated snowdrift game
Wang Jun-Fang,Guo Jin-Li,Liu Han,Shen Ai-Zhong. Evolution of zero-determinant strategy in iterated snowdrift game[J]. Acta Physica Sinica, 2017, 66(18): 180203-180203. DOI: 10.7498/aps.66.180203
Authors: Wang Jun-Fang  Guo Jin-Li  Liu Han  Shen Ai-Zhong
Affiliation: 1. Business School, University of Shanghai for Science and Technology, Shanghai 200093, China; 2. School of Mathematics and Statistics, North China University of Water Resources and Electric Power, Zhengzhou 450046, China; 3. Trade and Technology Department, Xijing University, Xi'an 710123, China
Abstract: A zero-determinant strategy can unilaterally set the opponent's payoff or enforce a linear relationship between the two players' payoffs, which allows the player to extort an unfair share from the opponent. Previous work has mostly focused on the steady state and used steady-state payoffs. However, if a player changes strategy frequently, as in everyday games, the steady state is not easily reached, so the transient income is needed whenever the payoff in the early rounds differs from that in the steady state. In addition, what happens when an evolutionary player encounters an extortioner? Previous work addressed this question only through simulation, without proof. First, for the iterated game between an extortioner and a cooperator, we use Markov chain theory to derive the transient distribution, the transient income, and the time needed to reach the steady state. The results show that the extortioner's payoff in the early rounds is higher than in the steady state when the extortion factor is small, and that the reverse holds when the extortion factor is large. Furthermore, the larger the extortion factor, the harder cooperation becomes, and a small extortion factor lets the game approach the steady state sooner. These results provide a method for calculating the dynamic payoffs of both sides and give a time scale for reaching the steady state. Second, for the iterated game between an extortioner and an evolutionary player, we prove that the evolutionary player must evolve into a full-cooperation strategy if both he and his opponent defect in the initial round. Then, assuming that the evolutionary speed is proportional to the gradient of his payoff, we simulate the evolutionary paths. The evolutionary speeds differ greatly among the four initial states. In particular, the evolutionary player switches to cooperation rapidly if he defects in the initial round, and gradually evolves into a cooperator if he cooperates in the initial round. That is, the evolutionary process depends on his initial behavior, but the final outcome does not. We conclude that the zero-determinant strategy acts as a catalyst in promoting cooperation. Finally, we prove that the strategy pair consisting of the zero-determinant strategy and full cooperation is not a Nash equilibrium.
Keywords: zero-determinant strategy  snowdrift game  stationary distribution  transient income
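
The Markov-chain calculation that the abstract describes for the extortioner versus full-cooperator game can be illustrated with a minimal sketch. Nothing below is taken from the paper itself: the payoff values `R, S, T, P`, the Press-Dyson form of the extortion strategy with baseline `P`, the parameter names `chi` (extortion factor) and `phi`, and the stopping rule used as "time to reach the steady state" are all assumptions chosen for illustration, and the paper's own definitions may differ.

```python
# Illustrative snowdrift payoffs (an assumption, not the paper's values).
# They satisfy the snowdrift ordering T > R > S > P.
R, S, T, P = 3.0, 1.0, 5.0, 0.0


def extortion_strategy(chi, phi):
    """Assumed Press-Dyson extortion strategy (p1, p2, p3, p4) with baseline P.

    Each entry is X's probability of cooperating after CC, CD, DC, DD.
    """
    p = (
        1.0 - phi * (chi - 1.0) * (R - P),
        1.0 + phi * ((S - P) - chi * (T - P)),
        phi * ((T - P) - chi * (S - P)),
        0.0,
    )
    if not all(0.0 <= x <= 1.0 for x in p):
        raise ValueError("chi and phi give probabilities outside [0, 1]")
    return p


def transition_matrix(p, q):
    """Markov matrix over states (CC, CD, DC, DD), with X's move listed first."""
    q_seen = (q[0], q[2], q[1], q[3])  # from Y's side, CD and DC are swapped
    return [
        [px * qy, px * (1 - qy), (1 - px) * qy, (1 - px) * (1 - qy)]
        for px, qy in zip(p, q_seen)
    ]


def step(v, m):
    """One round: row vector v times matrix m."""
    return [sum(v[i] * m[i][j] for i in range(4)) for j in range(4)]


def payoffs(v):
    """Expected one-round payoffs (X, Y) under state distribution v."""
    sx = v[0] * R + v[1] * S + v[2] * T + v[3] * P
    sy = v[0] * R + v[1] * T + v[2] * S + v[3] * P
    return sx, sy


def transient(p, q, v0, tol=1e-9, max_rounds=10_000):
    """Per-round (distribution, payoffs), stopping once successive
    distributions differ by less than tol in L1 norm (assumed criterion)."""
    m = transition_matrix(p, q)
    v = list(v0)
    history = [(v, payoffs(v))]
    for _ in range(max_rounds):
        nxt = step(v, m)
        history.append((nxt, payoffs(nxt)))
        if sum(abs(a - b) for a, b in zip(nxt, v)) < tol:
            break
        v = nxt
    return history


# Extortioner (chi = 2) against a full cooperator, starting from mutual cooperation.
ALLC = (1.0, 1.0, 1.0, 1.0)
history = transient(extortion_strategy(2.0, 0.1), ALLC, [1.0, 0.0, 0.0, 0.0])
print("rounds until the stopping rule fired:", len(history) - 1)
print("payoffs in the last round (X, Y):", history[-1][1])
```

Each entry of `history` holds the state distribution and the expected payoffs of one round, so the early-round payoffs can be compared directly with the last entry. Rerunning with different `chi` values or initial distributions gives a rough feel for how the extortion factor affects the approach to the steady state under these assumed parameters.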