Asymmetric Iterated Prisoner’s Dilemma on BA Scale-Free Network

Yunhao Ding (first author), Chunyan Zhang (second author), Jianlei Zhang (jianleizhang@nankai.edu.cn)
Abstract

In real-world scenarios, individuals often cooperate for mutual benefit. However, differences in wealth, reputation, and rationality can lead to varying outcomes for similar actions. Besides, in complex social networks, an individual’s choices are frequently influenced by their neighbors. To explore the evolution of strategies in realistic settings, we conducted repeated asymmetric prisoner’s dilemma experiments on a weighted Barabási-Albert (BA) scale-free network using a memory-one strategy framework. First, our analysis highlighted how the four components of memory-one strategies affect win rates. Second, during strategy evolution on the network, two key strategies emerged: “self-bad, partner-worse” and “altruist”. Finally, by introducing optimization mechanisms, we increased the cooperation levels among individuals within the group. These findings offer practical insights for addressing real-world problems.

Keywords: Iterated Prisoner’s Dilemma, Evolutionary Game, BA Scale-Free Network, Cooperation
Affiliations:

[a] Department of Automation, College of Artificial Intelligence, Nankai University, Tianjin 300071, China

[b] Tianjin Key Laboratory of Intelligence Robotics, Nankai University, Tianjin 300071, China

Highlights

We analyze and compare the characteristics of each component in the memory-one strategy framework.

We identify the “altruist” and “self-bad, partner-worse” strategies in an iterated asymmetric prisoner’s dilemma game on a weighted BA scale-free network.

We explore methods to enhance the average fitness of the population.

1 Introduction

Cooperation refers to behavior in which individuals, driven by common interests, coordinate to achieve better outcomes[1]. Cooperative behaviors are ubiquitous, ranging from foraging activities among animals to relations between nations[2]. Game theory, and later evolutionary game theory, emerged to study how individuals’ choices to cooperate or not under complex conditions affect the benefits to both parties[3]. Game theory provides the mathematical framework for analyzing scenarios characterized by conflict or competition. Evolutionary game theory, a branch of game theory, integrates concepts from evolutionary biology to explore strategic choices within a population and the dynamic processes of behavioral evolution[4, 5].

The Prisoner’s Dilemma (PD) is a classic model in game theory, originating from a scenario involving two captured prisoners who are unable to communicate with each other. PD presents a seemingly paradoxical problem: when faced with the choice between betrayal and cooperation, the rational choice for each prisoner is to betray, because, regardless of the other’s decision, confessing yields the best individual outcome. However, if both prisoners choose to betray, they will end up with a worse outcome than if they both had cooperated[6, 7, 8].

In the 1980s, Robert Axelrod organized two tournaments to study the performance of various strategies in the iterated Prisoner’s Dilemma (IPD) and to determine which strategies could balance cooperation and betrayal[9]. In IPD studies, strategies are often endowed with a degree of “memory,” allowing individuals to recall the outcomes of several previous rounds. It is generally believed that players with longer memory perform better in repeated games. However, research indicates that long-term memory does not confer a significant advantage over short-term memory[10]. As a result, memory-one strategies have become the most widely used framework in repeated games. Scholars have proposed several strategies within this framework, highlighting their benefits in specific contexts. In the aforementioned tournaments, the simple Tit-for-Tat (TFT) strategy won both times. The TFT strategy involves cooperating in the first round and then replicating the opponent’s action from the previous round. This cooperative approach yielded excellent results by not initiating betrayal but responding to it, thus balancing cooperation and punishment[11, 12]. Inspired by TFT, the Generous-TFT (GTFT) strategy was proposed, which also starts with cooperation and continues if the opponent does. However, unlike TFT, GTFT forgives the opponent’s betrayal with a certain probability[13]. This strategy maintains TFT’s ability to establish cooperation while adding tolerance, helping to avoid vicious cycles and promoting long-term cooperation. In the 1950s, psychologist Donald Hebb introduced the concept of Win-Stay, Lose-Shift (WSLS). In the 1990s, Nowak and Sigmund formally defined this strategy. Its principle is simple: if the previous round’s result was favorable, maintain the same decision in the current round; otherwise, change the decision[14, 15].
In 2012, Press and Dyson introduced the Zero-Determinant (ZD) strategy, which can unilaterally control the opponent’s payoff and enforce a linear relationship between their payoffs. Unlike the aforementioned strategies with clear rules, the ZD strategy encompasses a cluster of strategies based on repeated games[16]. Notably, its payoff is the expected long-term payoff rather than the exact payoff in any specific round.

Complex networks are systems composed of numerous interconnected nodes, which can be individuals, organizations, or other social units in reality. In these networks, the decisions and behaviors of individuals are often influenced by their surrounding nodes, leading to complex interactions and dynamic evolution[17, 18, 19]. Studying evolutionary games on complex networks allows us to better understand the dynamics of interactions, cooperation, and competition among individuals, offering solutions to real-world social problems.

Game theory typically assumes that participants are completely rational and symmetric. However, in reality, participants often differ in identity, character, and assets, which significantly influence their decisions and payoffs[20]. Introducing these differences makes game models more realistic and better able to reflect the complex interactions of the real world[21, 22, 23, 24]. Such participants have varying goals and resources in the game, leading to different strategies. For instance, resource-rich participants may be more willing to take risks, while resource-limited participants might prefer conservative strategies. By accounting for these differences in attributes, richer game models can be designed, resulting in fairer and more effective solutions.

The main structure of this paper is as follows: Chapter 1 is a brief introduction to game theory, the prisoner’s dilemma, complex networks, and asymmetric games. Chapter 2 describes the models and experimental procedures used in the study in detail. Chapter 3 presents and analyzes the experimental results. Chapter 4 summarizes the work conducted in the paper.

2 Models and Settings

2.1 The Prisoner’s Dilemma

The Prisoner’s Dilemma (PD) is a classic game theory model. In the traditional PD, there are two participants, X and Y, each with the same options: cooperate (C) or defect (D). Let the cost of cooperation be denoted as $c$, and the benefit obtained be denoted as $b$[25, 26]. In a single round, if both X and Y choose to cooperate, they both receive the same payoff $R\,(\text{reward}) = b - c$. If X chooses to cooperate while Y chooses to defect, the naive cooperator X incurs the cost of cooperation, resulting in payoff $S\,(\text{sucker}) = -c$, while the greedy defector Y avoids the cooperation cost and directly gains payoff $T\,(\text{temptation}) = b$. If both X and Y choose to defect, neither incurs the cost, and neither gains the benefit, resulting in a payoff $P\,(\text{punish}) = 0$ for both. Generally, it holds that $b > c > 0$ and $T > R > P > S$[27, 28]. In this paper, we set $b = 4$ and $c = 1$, yielding the following payoff matrix.

        C        D
C     R(3)     S(-1)
D     T(4)     P(0)
Table 1: Payoff Matrix

2.2 Memory-one Strategy

In this study, all individuals are assumed to adopt a memory-one strategy $\mathbf{p} = (p_R, p_S, p_T, p_P)$ for their interactions. The four parameters in this model correspond to the probability that an individual will choose to cooperate in the current round, given that the outcome of the previous round was $\mathbf{XY} = CC$, $\mathbf{XY} = CD$, $\mathbf{XY} = DC$ and $\mathbf{XY} = DD$ respectively. Besides, these probabilities satisfy the conditions $p_R \in [0,1]$, $p_S \in [0,1]$, $p_T \in [0,1]$ and $p_P \in [0,1]$[29].
Specifically, to distinguish the strategies of both parties in the game, when X and Y engage in a repeated PD, the strategy of individual X is denoted as $\mathbf{p} = (p_1, p_2, p_3, p_4)$, where $p_1$, $p_2$, $p_3$ and $p_4$ represent the probabilities of X choosing to cooperate given the previous round’s outcomes of $\mathbf{XY} = CC$, $\mathbf{XY} = CD$, $\mathbf{XY} = DC$ and $\mathbf{XY} = DD$ respectively. Similarly, the strategy of Y is denoted as $\mathbf{q} = (q_1, q_2, q_3, q_4)$, where the $q_n$ represent the probabilities of Y choosing to cooperate given the previous round’s outcomes of $\mathbf{XY} = CC$, $\mathbf{XY} = DC$, $\mathbf{XY} = CD$ and $\mathbf{XY} = DD$ respectively.
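The indexing convention above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the helper names (`swap_view`, `next_move`, `play`) and the assumed outcome preceding the first round (`first`) are hypothetical choices made only for illustration. Each player reads the previous outcome from its own point of view, which is why Y's vector is indexed in the order CC, DC, CD, DD when outcomes are written as XY.

```python
import random

# Outcomes are written from a player's own point of view (own move first).
STATES = ("CC", "CD", "DC", "DD")

def swap_view(state: str) -> str:
    """Re-express an outcome XY from the other player's point of view."""
    return state[1] + state[0]

def next_move(strategy, own_view_state: str, rng: random.Random) -> str:
    """Sample C or D from a memory-one strategy given last round's outcome."""
    p_cooperate = strategy[STATES.index(own_view_state)]
    return "C" if rng.random() < p_cooperate else "D"

def play(p, q, first="CC", rounds=10, seed=0):
    """Play X (strategy p) against Y (strategy q); `first` is an assumed
    fictitious outcome preceding round one (not specified in the text)."""
    rng = random.Random(seed)
    state = first  # always stored as XY, i.e. from X's point of view
    history = []
    for _ in range(rounds):
        x = next_move(p, state, rng)
        y = next_move(q, swap_view(state), rng)
        state = x + y
        history.append(state)
    return history

TFT = (1, 0, 1, 0)   # cooperate exactly when the opponent cooperated last round
ALLD = (0, 0, 0, 0)  # always defect
print(play(TFT, ALLD, rounds=4))
```

With deterministic strategies such as these the random draws do not matter; with interior probabilities the history depends on the seed.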

2.3 Asymmetric Element

The asymmetry in this study is reflected in the concept of “wealth value”. Wealth value integrates factors such as reputation, status and capital, leading to varying returns for individuals in the game. In this study, the wealth value $k$ lies in the interval $(0, 10)$, reflecting the differences between individuals[30].

Let the total number of individuals in the group be $N$. Before the first round, $N$ random numbers within the range $(0, 10)$ are generated and assigned to the individuals as their initial wealth values.

From Table 1, the basic payoff matrix under symmetric games can be derived as follows.

\[
A = \begin{bmatrix} R & S \\ T & P \end{bmatrix} = \begin{bmatrix} b - c & -c \\ b & 0 \end{bmatrix} = \begin{bmatrix} 3 & -1 \\ 4 & 0 \end{bmatrix} \tag{1}
\]

For X and Y, with wealth values $k_1$ and $k_2$ respectively, the payoff matrices are defined as follows.

\[
A_{\mathbf{X}} = \begin{bmatrix} R & S \\ T & P \end{bmatrix} = \begin{bmatrix} k_1 (b - c) & -k_1 c \\ k_1 b & 0 \end{bmatrix} = \begin{bmatrix} 3k_1 & -k_1 \\ 4k_1 & 0 \end{bmatrix} \tag{2}
\]
\[
A_{\mathbf{Y}} = \begin{bmatrix} R & S \\ T & P \end{bmatrix} = \begin{bmatrix} k_2 (b - c) & -k_2 c \\ k_2 b & 0 \end{bmatrix} = \begin{bmatrix} 3k_2 & -k_2 \\ 4k_2 & 0 \end{bmatrix} \tag{3}
\]

This setting integrates the impact of wealth value into the payoff matrix. It can be understood as follows: because the two participants in the game differ in identity, status, and assets, each can only obtain returns that match their position. For instance, in a special PD, two prisoners have committed a crime together, but their sentences differ due to their different roles in the crime. Suppose prisoner X’s crime is more severe, leading to a longer sentence, while prisoner Y’s crime is less severe, resulting in a shorter sentence. If they both cooperate, they will achieve outcomes proportional to their sentences: 2 years and 1 year respectively. If X cooperates and Y defects, Y will be immediately released, while X will stay in prison for 10 years. Conversely, if X defects and Y cooperates, X will be released immediately, while Y will receive 8 years. If they both defect, they will receive relatively bad outcomes: 5 years and 3 years respectively.
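As a small illustration of the wealth-scaled payoff matrices, the following Python sketch builds the matrix for a given wealth value. The function name `scaled_payoff_matrix` is hypothetical; the defaults $b = 4$ and $c = 1$ are the values used in this paper, and $k = 1$ recovers the symmetric matrix.

```python
def scaled_payoff_matrix(k: float, b: float = 4.0, c: float = 1.0):
    """Payoff matrix [[R, S], [T, P]] of a player with wealth value k.

    Rows are the player's own move (C, D); columns are the opponent's move.
    """
    if not b > c > 0:
        raise ValueError("the dilemma requires b > c > 0")
    return [[k * (b - c), -k * c],
            [k * b, 0.0]]

# A wealthy player (k = 8) and a poorer one (k = 2) face the same dilemma,
# but every non-zero payoff is scaled by their own wealth value.
rich = scaled_payoff_matrix(8.0)
poor = scaled_payoff_matrix(2.0)
print(rich)
print(poor)
```

Because every entry is multiplied by the same positive factor, the ordering $T > R > P > S$ is preserved for each player; only the stakes differ.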

2.4 Payoff Calculation

In a symmetric game, where the wealth values of X and Y are both 1, the payoff vector for X is defined as $R_{\mathbf{X}} = (3, -1, 4, 0)$, and correspondingly, the payoff vector for Y is defined as $R_{\mathbf{Y}} = (3, 4, -1, 0)$. For the asymmetric game, due to changes in the payoff matrix, the payoff vectors for both players become $R_{\mathbf{X}} = (3k_1, -k_1, 4k_1, 0)$ and $R_{\mathbf{Y}} = (3k_2, 4k_2, -k_2, 0)$ respectively. For both scenarios, after a single interaction, the expected payoffs for X and Y can be calculated using the following formula.

\[
\begin{aligned}
r_{\mathbf{X}} &= \frac{\mu \cdot R_{\mathbf{X}}}{\mu \cdot \mathbf{1}} = \frac{D(p, q, R_{\mathbf{X}})}{D(p, q, \mathbf{1})} \\
r_{\mathbf{Y}} &= \frac{\mu \cdot R_{\mathbf{Y}}}{\mu \cdot \mathbf{1}} = \frac{D(p, q, R_{\mathbf{Y}})}{D(p, q, \mathbf{1})}
\end{aligned} \tag{4}
\]

Here, $\mu$ is the stationary vector of the Markov chain defined by the strategies $p$ and $q$, and $\mathbf{1}$ is the all-ones vector. In addition, for any vector $\mathbf{h}$,

\[
\begin{aligned}
\mu \cdot \mathbf{h} &= D(p, q, \mathbf{h}) = \begin{vmatrix}
-1 + p_1 q_1 & -1 + p_1 & -1 + q_1 & h_1 \\
p_2 q_3 & -1 + p_2 & q_3 & h_2 \\
p_3 q_2 & p_3 & -1 + q_2 & h_3 \\
p_4 q_4 & p_4 & q_4 & h_4
\end{vmatrix} \\
\mathbf{h} &= \begin{bmatrix} h_1 \\ h_2 \\ h_3 \\ h_4 \end{bmatrix}
\end{aligned} \tag{5}
\]

In this paper, the expected payoff of each node in the network is defined as the average of the payoffs it obtains from interactions with all its neighbors.
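The determinant formula of Eqs. (4) and (5) can be evaluated directly. The sketch below is a plain-Python illustration rather than the authors' code: the function names (`det`, `D`, `expected_payoffs`) are hypothetical, and the determinant is computed by simple Laplace expansion, which is adequate for a 4x4 matrix.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (fine for 4x4)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def D(p, q, h):
    """The determinant D(p, q, h) of Eq. (5)."""
    p1, p2, p3, p4 = p
    q1, q2, q3, q4 = q
    return det([
        [-1 + p1 * q1, -1 + p1, -1 + q1, h[0]],
        [p2 * q3,      -1 + p2, q3,      h[1]],
        [p3 * q2,      p3,      -1 + q2, h[2]],
        [p4 * q4,      p4,      q4,      h[3]],
    ])

def expected_payoffs(p, q, k1=1.0, k2=1.0):
    """Expected payoffs (r_X, r_Y) of Eq. (4) for wealth values k1 and k2."""
    r_x = (3 * k1, -k1, 4 * k1, 0.0)
    r_y = (3 * k2, 4 * k2, -k2, 0.0)
    norm = D(p, q, (1.0, 1.0, 1.0, 1.0))
    if abs(norm) < 1e-12:
        # Happens when the chain has no unique stationary distribution,
        # e.g. for some pairs of deterministic strategies.
        raise ValueError("D(p, q, 1) is zero for this strategy pair")
    return D(p, q, r_x) / norm, D(p, q, r_y) / norm

# Two fully random players visit all four outcomes equally often.
print(expected_payoffs((0.5,) * 4, (0.5,) * 4))
```

For two fully random players the four outcomes are equally likely, so each player's expected payoff is the mean of its payoff vector, $(3 - 1 + 4 + 0)/4 = 1.5$ in the symmetric case.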

2.5 Strategy Updating Method

In complex networks, the actions of individuals can be divided into interaction and updating. During the simulation process, interaction involves each individual playing a two-person asymmetric PD game with all its neighbors and obtaining the corresponding payoff. Updating occurs after each individual completes a round of games: each individual randomly selects one of its neighbors to compare payoffs and decide whether to update their strategy.

We use the Fermi function to decide whether an individual updates its strategy. Consider X, whose neighbor set is P. X obtains an average payoff $r_{\mathbf{X}}$ from the most recent round of games with all players in P. X then randomly selects a player Y from P, who has obtained an average payoff $r_{\mathbf{Y}}$ in the same round. According to the Fermi dynamics, X will adopt Y’s strategy in the next round with probability $w$, or will continue using its current strategy with probability $1 - w$[31, 32].

\[
w = \frac{1}{1 + \exp\left(\dfrac{r_{\mathbf{X}} - r_{\mathbf{Y}}}{k}\right)} \tag{6}
\]

In the denominator of the formula, the parameter $k$ (not to be confused with the wealth value of Section 2.3) characterizes the rationality level of individuals in the network. As $k$ approaches infinity, individuals gradually tend to make completely random choices regarding whether to update their strategy. Conversely, as $k$ approaches zero, individuals become fully rational, meaning they will adopt the other individual’s strategy as long as the other’s expected payoff is higher than their own.

According to previous research, in symmetric games, $k$ is often set to 1. However, this value cannot be directly applied to asymmetric games. For example, consider a game between X with huge wealth and Y with relatively low wealth. Because X has a much larger principal, he can obtain significantly higher payoffs compared to Y. However, Y should not easily adopt X’s strategy, because with his relatively smaller principal, adopting X’s strategy will not lead to a significant increase in his payoff. In this paper, the parameter $k$ is set to 8 to match the outcomes of symmetric games where $k = 1$.
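The Fermi rule of Eq. (6) is short enough to sketch in Python. The names `fermi_probability` and `maybe_imitate` are hypothetical; the default $k = 8$ is the value chosen in this paper for the asymmetric game.

```python
import math
import random

def fermi_probability(r_x: float, r_y: float, k: float = 8.0) -> float:
    """Probability w that X adopts Y's strategy, as in Eq. (6).

    Payoffs here are bounded, so the exponent stays far from overflow.
    """
    return 1.0 / (1.0 + math.exp((r_x - r_y) / k))

def maybe_imitate(strategy_x, strategy_y, r_x, r_y, k=8.0, rng=random):
    """Return the strategy X uses next round: Y's with probability w."""
    w = fermi_probability(r_x, r_y, k)
    return strategy_y if rng.random() < w else strategy_x

# Equal payoffs give a coin flip; a better-off neighbor is copied more often,
# and a smaller k makes the decision sharper.
print(fermi_probability(2.0, 2.0))
print(fermi_probability(0.0, 10.0, k=8.0))
print(fermi_probability(0.0, 10.0, k=1.0))
```

The last two lines show why $k$ must grow with the payoff scale: with the larger payoff differences of the asymmetric game, $k = 1$ would make imitation of any richer neighbor almost certain.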

Figure 1: (a). Distribution Map of Conversion Probability on Symmetric Network. (b). Distribution Map of Conversion Probability on Asymmetric Network.

2.6 Wealth Updating Method

In this paper, all wealth values are defined within the interval $(0, 10)$. Initially, each “participant” in the network randomly receives a wealth value within this range. During one round, participants receive an average payoff, which depends on their original wealth value and strategy choice. Given the bounded interval for wealth values, the average payoff will also fall within a specific range. After this single round, the original wealth of all participants and their average payoff from that round are summed. This total is then normalized to the interval $(0, 10)$ to ensure a unified standard for wealth values, preventing any strong individual’s wealth from growing excessively and causing the strategy set to converge too quickly. Additionally, it is important to emphasize that after each round of wealth updates, the payoff vector $R = (3k, -k, 4k, 0)$ of each individual will also change accordingly. This means the strategy dynamics are continually influenced by the updated wealth and payoff values, maintaining a dynamic and adaptive system throughout the simulation process.
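The text does not state which normalization is used, so the following Python sketch is only one possible reading. It assumes a min-max rescaling of the summed values into the open interval $(0, 10)$; the function name `update_wealth` and the small margin `eps` that keeps values strictly inside the interval are hypothetical.

```python
def update_wealth(wealth, avg_payoff, upper=10.0, eps=1e-6):
    """Add each node's average payoff to its wealth, then rescale to (0, upper).

    ASSUMPTION: min-max rescaling; the paper only says the totals are
    "normalized to the interval (0, 10)".
    """
    totals = [w + r for w, r in zip(wealth, avg_payoff)]
    lo, hi = min(totals), max(totals)
    if hi == lo:
        # Degenerate case: everyone ends up equal, place them mid-interval.
        return [upper / 2.0] * len(totals)
    span = upper - 2 * eps
    return [eps + span * (t - lo) / (hi - lo) for t in totals]

new_wealth = update_wealth([1.0, 5.0, 9.0], [0.5, -2.0, 10.0])
# Each node's payoff vector (3k, -k, 4k, 0) is rebuilt from its new wealth.
payoff_vectors = [(3 * k, -k, 4 * k, 0.0) for k in new_wealth]
print(new_wealth)
```

Any monotone rescaling would preserve the wealth ranking; min-max scaling is chosen here only because it is simple and keeps every value inside the stated interval.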

3 Results

This chapter is divided into three sections. The first section provides a brief classification and discussion of the strategy domain. The second section elaborates on the evolutionary game of asymmetric Prisoner’s Dilemma on BA scale-free network and analyzes the result. The third section conducts supplementary experiments on the evolutionary outcomes.

3.1 Classification and Discussion of Strategy Domain

3.1.1 Analysis of Win Rate Curves at Different Cooperation Levels

An analysis is conducted to understand the impact of each component of $S$ on the payoff (win rate) against random strategies. For each component $p \in [0, 1]$ of $S = (p_1, p_2, p_3, p_4)$, we choose $p = 0.2$, $p = 0.5$ and $p = 0.8$ to represent “low cooperation willingness”, “medium cooperation willingness” and “high cooperation willingness” respectively. $p_1$, $p_2$, $p_3$ and $p_4$ are controlled separately for simulations and statistical classification. 10,000 random strategies are generated to calculate the win rate of $S$ against these random strategies. Partial experimental results are presented below, and the rest are presented in the appendix.
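A win-rate estimate of this kind can be sketched as follows. Several choices here are assumptions rather than details taken from the text: a “win” is taken to mean that $S$ earns a strictly higher expected payoff than its opponent in the symmetric game, opponents are drawn uniformly from $[0, 1)^4$, and the stationary distribution is approximated by power iteration on the four-state Markov chain instead of the determinant formula. All function names are hypothetical.

```python
import random

# Index of each outcome (CC, CD, DC, DD as XY) in Y's own-view ordering.
Y_VIEW = (0, 2, 1, 3)

def transition_matrix(p, q):
    """Row s gives the distribution over next outcomes after outcome s."""
    rows = []
    for s in range(4):
        px, qy = p[s], q[Y_VIEW[s]]
        rows.append([px * qy, px * (1 - qy), (1 - px) * qy, (1 - px) * (1 - qy)])
    return rows

def stationary(p, q, iters=500):
    """Approximate the stationary distribution by repeated multiplication."""
    m = transition_matrix(p, q)
    v = [0.25] * 4
    for _ in range(iters):
        v = [sum(v[i] * m[i][j] for i in range(4)) for j in range(4)]
    return v

def win_rate(s, n_opponents=1000, seed=0):
    """Fraction of random opponents against which s earns strictly more."""
    rng = random.Random(seed)
    r_x, r_y = (3, -1, 4, 0), (3, 4, -1, 0)
    wins = 0
    for _ in range(n_opponents):
        q = [rng.random() for _ in range(4)]
        v = stationary(s, q)
        payoff_s = sum(a * b for a, b in zip(v, r_x))
        payoff_q = sum(a * b for a, b in zip(v, r_y))
        wins += payoff_s > payoff_q
    return wins / n_opponents

for level in (0.2, 0.5, 0.8):
    print(level, win_rate((level, 0.5, 0.5, 0.5), n_opponents=200))
```

Power iteration is only an approximation and converges slowly for nearly deterministic strategy pairs, so the printed values should be read as rough estimates.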


Figure 2: $p_1$’s Impact on Win Rate against Random Strategies When $p_2 = 0.2$, $p_2 = 0.5$ and $p_2 = 0.8$ Respectively

In Figure 2, the curves indicate that the performance of strategy $S$ against random strategies is negatively impacted by increases in the values of parameters $p_1$, $p_2$, $p_3$ and $p_4$.

Specifically, the trends shown by the solid, dashed, and dotted curves of the same color suggest that an increase in parameter $p_4$ has a detrimental effect on the win rate. Similarly, the trends shown by the same line types in different colors indicate that an increase in parameter $p_3$ has the same effect. Furthermore, the overall downward trends observed across the three panels as $p_1$ increases, as well as the progressive reductions in the values of the corresponding curves from panel to panel, demonstrate that growth in either $p_1$ or $p_2$ adversely affects the win rate.

3.1.2 Comparison of Each $p$

To analyze the relative impact of the four components on win rates, heatmaps were generated under different combinations of these components. For instance, when comparing the relative effects of $p_1$ and $p_2$, fixed values were assigned to $p_3$ and $p_4$. As $p_1$ and $p_2$ varied from low to high, 1000 random strategies were generated at each sampling point, and the win rate against these random strategies was computed to create a heatmap of the distribution. Below is a selection of the experimental results, with the remaining results displayed in the appendix.



Figure 3: Impact of $p_1$ and $p_2$ on Win Rate When $p_3$ and $p_4$ Vary

Figure 3 illustrates the impact of $p_1$ and $p_2$ on the win rate of $S = (p_1, p_2, p_3, p_4)$ against random strategies when $p_3$ and $p_4$ are set to 0.2, 0.5 and 0.8 respectively. The horizontal and vertical axes have consistent meanings across panels, and the two distinct color spots in the figure represent win rates ($W$) satisfying $0.495 \leq W \leq 0.505$ and $0.745 \leq W \leq 0.755$ (with only the former appearing in Figure (i)). Taking Figure (a) as an example, both $p_3$ and $p_4$ are at relatively low levels (corresponding to the practical scenario where the probability of cooperating in the current round is low after the player using strategy $S$ defected in the previous round). To improve the player’s win rate, $p_1$ and $p_2$ should be maintained at low levels, which is consistent with the preliminary conclusions obtained earlier.
Furthermore, maintaining $p_1$ at a low level is more conducive to achieving better results than reducing $p_2$; that is, “reducing $p_1$” is more beneficial for victory than “reducing $p_2$”. Given the practical significance of $p_1$ and $p_2$, it can be inferred that “greedily” defecting after both sides cooperated in the previous round can reap greater benefits, while showing some tolerance after the $S$-user cooperated and the opponent defected in the previous round might also yield favorable outcomes.

Figure 4: Impact of $p_3$ and $p_4$ on Win Rate When $p_1$ and $p_2$ Vary

Figure 4 illustrates the impact of $p_3$ and $p_4$ on the win rate of $S=(p_1,p_2,p_3,p_4)$ against random strategies when $p_1$ and $p_2$ are each set to 0.2, 0.5 and 0.8. In this example, $p_1$ is at a high level and $p_2$ is at a moderate level. This corresponds to a situation where the player has a high probability of cooperating if both players cooperated in the previous round, and a moderate probability of cooperating if the player cooperated but the opponent defected in the previous round. The results show that to achieve a high win rate, $p_3$ and $p_4$ should be kept relatively low. Specifically, keeping $p_3$ low is more beneficial for outperforming the random strategy than keeping $p_4$ low. This suggests that when both players defected in the previous round, it may be advantageous to “reconcile” to some degree rather than continuing to defect. Conversely, when the player defected but the opponent cooperated in the previous round, the player should consider continuing to “exploit” the opponent’s goodwill, as this can lead to more favorable outcomes.

Our study of the four parameters shows that they affect the win rate to different degrees: $p_1$ is the most influential, followed by $p_2$, $p_3$ and $p_4$. This suggests that if the strategy $S=(p_1,p_2,p_3,p_4)$ can adjust only one parameter, it is most beneficial to keep $p_1$ relatively low, while a moderate increase in $p_4$ can also be considered. In practical terms, after mutual cooperation in the previous round, the optimal choice for strategy $S$ is to lean towards defection. After mutual defection in the previous round, continuing to defect is likewise the best option for the win rate. However, moderately increasing the probability of cooperation in this scenario can help maintain one’s own payoff while appearing less purely self-interested.
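The role of each component can be made concrete with a small simulation. The sketch below assumes the usual memory-one ordering, in which $p_1$ to $p_4$ give the probability of cooperating after the previous outcomes CC, CD, DC and DD (own move listed first), consistent with the interpretations used above. The payoff values, the forced mutual cooperation in the first round, and all function names are illustrative assumptions; they are not the asymmetric payoff matrix or the code used in the experiments.

```python
import random

# Hypothetical symmetric payoffs (T > R > P > S). The paper's asymmetric
# payoff matrix is defined elsewhere and is not reproduced here.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

# Previous outcome (own move, opponent move) -> index into (p1, p2, p3, p4).
STATE_INDEX = {("C", "C"): 0, ("C", "D"): 1, ("D", "C"): 2, ("D", "D"): 3}


def next_move(strategy, own_last, other_last, rng):
    """Cooperate with the probability attached to the previous outcome."""
    p = strategy[STATE_INDEX[(own_last, other_last)]]
    return "C" if rng.random() < p else "D"


def play(strategy_a, strategy_b, rounds, rng, first=("C", "C")):
    """Play an iterated game and return the total payoffs of both players."""
    a, b = first  # assumed opening moves
    total_a = total_b = 0
    for _ in range(rounds):
        pay_a, pay_b = PAYOFF[(a, b)]
        total_a += pay_a
        total_b += pay_b
        a, b = (next_move(strategy_a, a, b, rng),
                next_move(strategy_b, b, a, rng))
    return total_a, total_b


def win_rate(strategy, n_opponents, rounds, rng):
    """Fraction of uniformly random opponents that `strategy` out-earns."""
    wins = 0
    for _ in range(n_opponents):
        opponent = tuple(rng.random() for _ in range(4))
        total_s, total_o = play(strategy, opponent, rounds, rng)
        wins += total_s > total_o
    return wins / n_opponents
```

Under these assumed payoffs, an always-cooperating strategy `(1, 1, 1, 1)` can never out-earn its opponent, so its win rate is 0, which is in line with the observation that low cooperation probabilities favor winning.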

3.2 Asymmetric IPD on BA Scale-Free Network

3.2.1 Experimental Settings and Steps

The BA scale-free network model is a dynamic network model commonly used to generate scale-free networks. It simulates the “rich-get-richer” phenomenon observed in social networks, where nodes with higher degrees are more likely to attract new connections[33]. In our experiment, the network has a final node count of $n=1000$, and each newly added node has an initial degree of $m=20$. This results in a total of $e=19600$ edges.
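The stated edge count follows from the growth rule if the network starts from $m$ isolated seed nodes and every later node attaches to $m$ distinct existing nodes, since then $e = m(n-m) = 20 \times 980 = 19600$. The pure-Python sketch below illustrates preferential attachment under that assumption; the seed configuration and the function name are hypothetical, as the text does not state how the network was generated.

```python
import random


def barabasi_albert_edges(n, m, seed=None):
    """Grow a BA-style graph from m isolated seed nodes; return its edge list."""
    rng = random.Random(seed)
    edges = []
    targets = list(range(m))   # the first new node links to all seed nodes
    repeated = []              # each node appears once per incident edge
    for source in range(m, n):
        edges.extend((source, t) for t in targets)
        repeated.extend(targets)
        repeated.extend([source] * m)
        # Drawing from `repeated` selects nodes proportionally to their degree.
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(repeated))
        targets = list(chosen)
    return edges


edges = barabasi_albert_edges(1000, 20, seed=0)
print(len(edges))  # m * (n - m) = 19600
```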

In the experiment, each node in the network represents an individual with an initial state characterized by a randomly assigned memory-one strategy. These individuals engage in interactions with all their neighbors in each round of the game, generating payoffs based on these interactions. After each round, individuals update their strategies based on the payoffs they and their neighbors received. This iterative process continues until the network reaches an equilibrium state. An equilibrium state is defined as either a state where only one strategy remains across the network, or a dynamic equilibrium where, after 2000 rounds of games, several (usually no more than three) strategies persist.
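The stopping criterion described above can be sketched as a simple loop. Here `step` is a hypothetical placeholder for one full round (games with all neighbors followed by the strategy update), whose details are not restated in this section; `copy_node_zero` is a toy update used only to exercise the loop.

```python
def run_until_equilibrium(strategies, step, max_rounds=2000):
    """Iterate `step` until one strategy remains or `max_rounds` is reached.

    strategies: dict mapping node -> strategy tuple (p1, p2, p3, p4)
    step: hypothetical callable that plays one round and returns the
          updated node -> strategy dict
    """
    for round_index in range(1, max_rounds + 1):
        strategies = step(strategies)
        if len(set(strategies.values())) == 1:
            return strategies, round_index, "single strategy"
    # Several strategies still coexist after max_rounds rounds.
    return strategies, max_rounds, "dynamic equilibrium"


def copy_node_zero(strategies):
    """Toy update: every node imitates node 0."""
    return {node: strategies[0] for node in strategies}


start = {0: (0.1, 0.1, 0.1, 0.1), 1: (0.9, 0.9, 0.9, 0.9)}
final, rounds_used, kind = run_until_equilibrium(start, copy_node_zero)
```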

Figure 5: Experimental Flowchart

3.2.2 Experimental Results

Following the method described above, 1000 repeated experiments were conducted, yielding over 1200 distinct dominant strategies. These strategies were categorized and analyzed using clustering algorithms.

Because every component of the strategy set is confined to the narrow interval $p\in[0,1]$, clustering algorithms such as DBSCAN, which require specifying a neighborhood radius and a minimum neighborhood size, were difficult to tune. Therefore, we used the K-Means algorithm, supported by the elbow method and the silhouette coefficient, and found that the appropriate number of clusters is 6. Table 2 presents the coordinates of the center points and the number of individuals in each cluster.

| Number | Center Point of Cluster $(p_1, p_2, p_3, p_4)$ | Cluster Size |
|---|---|---|
| 1 | $[0.2761, 0.6548, 0.1609, 0.2059]$ | 229 |
| 2 | $[0.7709, 0.3186, 0.1537, 0.1782]$ | 256 |
| 3 | $[0.1964, 0.3716, 0.6072, 0.6647]$ | 170 |
| 4 | $[0.7031, 0.3372, 0.7134, 0.3185]$ | 155 |
| 5 | $[0.6894, 0.3174, 0.1914, 0.7038]$ | 167 |
| 6 | $[0.2365, 0.1649, 0.1814, 0.2794]$ | 284 |
Table 2: Clusters’ Information
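As a small illustration of how the cluster centers can be used, the sketch below assigns an arbitrary memory-one strategy to its nearest center by Euclidean distance, which is the assignment rule K-Means applies. The center coordinates and sizes are copied from Table 2; the function name is hypothetical.

```python
import math

# Cluster centers (p1, p2, p3, p4) and sizes as reported in Table 2.
CENTERS = {
    1: (0.2761, 0.6548, 0.1609, 0.2059),
    2: (0.7709, 0.3186, 0.1537, 0.1782),
    3: (0.1964, 0.3716, 0.6072, 0.6647),
    4: (0.7031, 0.3372, 0.7134, 0.3185),
    5: (0.6894, 0.3174, 0.1914, 0.7038),
    6: (0.2365, 0.1649, 0.1814, 0.2794),
}
SIZES = {1: 229, 2: 256, 3: 170, 4: 155, 5: 167, 6: 284}


def nearest_cluster(strategy):
    """Return the number of the cluster whose center is closest."""
    return min(CENTERS, key=lambda k: math.dist(strategy, CENTERS[k]))


print(nearest_cluster((0.2, 0.2, 0.2, 0.2)))  # low cooperation everywhere
print(sum(SIZES.values()))                    # total number of strategies
```

The cluster sizes sum to 1261, which matches the statement that over 1200 distinct dominant strategies were obtained.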

To further investigate the characteristics of each cluster, we recorded the payoffs when clusters confronted each other, as well as their win rates and average payoffs against 10,000 random strategies. The results are shown in Table 3.

| | $S_1$ | $S_2$ | $S_3$ | $S_4$ | $S_5$ | $S_6$ |
|---|---|---|---|---|---|---|
| $S_1$ | 0.9122 | 0.6274 | 1.7089 | 1.4411 | 1.6576 | 0.6205 |
| $S_2$ | 1.0206 | 0.7428 | 1.7172 | 1.4502 | 1.7318 | 0.7242 |
| $S_3$ | 0.5270 | 0.5229 | 1.4123 | 1.5617 | 1.4366 | 0.4060 |
| $S_4$ | 0.6135 | 0.6807 | 1.3813 | 1.5878 | 1.3996 | 0.4989 |
| $S_5$ | 0.5841 | 0.6079 | 1.4118 | 1.5600 | 1.5476 | 0.3979 |
| $S_6$ | 1.0201 | 0.6828 | 1.7492 | 1.4266 | 1.7859 | 0.7163 |
Table 3: The Game Results of Each Cluster and the Win Rate and Average Payoff Facing Random Strategies

The table entries indicate the payoff obtained by the row strategy when facing the column strategy. For example, the value 0.6274 in the cell corresponding to $S_1$-$S_2$ indicates that strategy $S_1$ gains a payoff of 0.6274 when confronting $S_2$.

The red entries show the payoffs in the games between strategy $S_2$ and all other strategies, highlighting that $S_2$ consistently achieves a higher payoff than its opponent, regardless of which strategy the opponent uses. Additionally, $S_2$’s self-play payoff (0.7428) is lower than the self-play payoffs of most other strategies; only that of $S_6$ (0.7163) is lower. $S_2$ also displays a significant advantage against random strategies. Together, these results demonstrate a “self-bad, partner-worse” outcome in the asymmetric prisoner’s dilemma and suggest that such strategies can emerge and maintain a certain scale[34].

The green-background entries represent the results of strategy $S_4$ against all other strategies. $S_4$ always obtains a lower payoff than its opponents, who achieve relatively high payoffs. Moreover, $S_4$’s self-play payoff is higher than the self-play payoffs of all other strategies, indicating its inclination towards seeking cooperation and ensuring better outcomes for both parties. In real life, this strategy corresponds to the “altruists” who prioritize the overall good. Furthermore, $S_4$ does not exhibit an advantage against random strategies and has the smallest number of individuals among the six strategy clusters, which is consistent with intuition.
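The pairwise claims about $S_2$ and $S_4$ can be checked directly against the payoff entries of Table 3. The sketch below copies those entries (row strategy’s payoff against the column strategy) and tests which strategies out-earn every opponent and which are out-earned by every opponent; the variable and function names are hypothetical.

```python
# PAYOFF_TABLE[i][j]: payoff of S_(i+1) when facing S_(j+1), from Table 3.
PAYOFF_TABLE = [
    [0.9122, 0.6274, 1.7089, 1.4411, 1.6576, 0.6205],
    [1.0206, 0.7428, 1.7172, 1.4502, 1.7318, 0.7242],
    [0.5270, 0.5229, 1.4123, 1.5617, 1.4366, 0.4060],
    [0.6135, 0.6807, 1.3813, 1.5878, 1.3996, 0.4989],
    [0.5841, 0.6079, 1.4118, 1.5600, 1.5476, 0.3979],
    [1.0201, 0.6828, 1.7492, 1.4266, 1.7859, 0.7163],
]
N = len(PAYOFF_TABLE)


def out_earns_all(i):
    """True if strategy i earns more than each opponent in their pairing."""
    return all(PAYOFF_TABLE[i][j] > PAYOFF_TABLE[j][i]
               for j in range(N) if j != i)


def out_earned_by_all(i):
    """True if strategy i earns less than each opponent in their pairing."""
    return all(PAYOFF_TABLE[i][j] < PAYOFF_TABLE[j][i]
               for j in range(N) if j != i)


winners = [f"S{i + 1}" for i in range(N) if out_earns_all(i)]
losers = [f"S{i + 1}" for i in range(N) if out_earned_by_all(i)]
self_play = {f"S{i + 1}": PAYOFF_TABLE[i][i] for i in range(N)}
print(winners, losers, max(self_play, key=self_play.get))
```

With the Table 3 values, $S_2$ is the only strategy that out-earns every opponent, $S_4$ is the only one out-earned by every opponent, and $S_4$ has the highest self-play payoff.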

3.3 The Evolution and Spread of Cooperative Strategies on Network

In the previous section, we observed that in the evolutionary game of the asymmetric prisoner’s dilemma on a BA scale-free network, the $S_4$ strategy might ultimately evolve. In real life, we always hope that groups tend towards cooperation. For example, parents teach their children in kindergarten to become good friends with their peers rather than encouraging hostility towards them. Similarly, in international affairs, powerful nations have always sought friendly exchanges with other nations to promote cooperation and mutual development. This part of the experiment aims to find a method to foster cooperation.

A crucial question is how to measure enhanced cooperation. We adopt the following criterion: if the level of cooperation increases, the fitness of the population improves, which in this experiment corresponds to an increase in the average payoff of the group[35]. Based on this idea, we conducted two supplementary experiments.

In this section, we employ a BA scale-free network model with 100 nodes, in which each newly added node has an initial degree of 4. Each node also has an initial “wealth value”. Based on the experimental procedure described in Section 3.2, we made the following two modifications.

(1) Initial Entry of Strategy $S_4$

Before starting the experiment, we introduced strategy $S_4$ to the initial network according to the following rules.

Random Selection: Randomly select several nodes at the initial stage and assign them strategy $S_4$.

Degree-Based Selection: Select several nodes at the initial stage based on their degree from high to low and assign them strategy $S_4$.

Wealth-Based Selection: Select several nodes at the initial stage based on their initial wealth value from high to low and assign them strategy $S_4$.

These operations aim to spread strategy $S_4$ by leveraging the strategies of important nodes.
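The three selection rules can be sketched as follows. This is a minimal illustration assuming that degrees and wealth values are available as dictionaries keyed by node; the function name, the rule labels, and the tie-breaking by insertion order are assumptions rather than details given in the text.

```python
import random


def select_seed_nodes(rule, count, degree, wealth, rng):
    """Pick `count` nodes that will start with strategy S4.

    degree, wealth: dicts mapping node -> degree / initial wealth value
    rule: "random", "degree" or "wealth" (hypothetical labels)
    """
    nodes = list(degree)
    if rule == "random":
        return rng.sample(nodes, count)
    if rule == "degree":
        return sorted(nodes, key=lambda v: degree[v], reverse=True)[:count]
    if rule == "wealth":
        return sorted(nodes, key=lambda v: wealth[v], reverse=True)[:count]
    raise ValueError(f"unknown rule: {rule}")


degree = {0: 3, 1: 1, 2: 5, 3: 2}
wealth = {0: 10, 1: 40, 2: 20, 3: 30}
rng = random.Random(0)
by_degree = select_seed_nodes("degree", 2, degree, wealth, rng)
by_wealth = select_seed_nodes("wealth", 2, degree, wealth, rng)
```

A mixed rule, such as the combination of high-wealth and high-degree nodes discussed with Figure 6, could be built by concatenating the results of two calls with smaller counts.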

(2) Eliminating Low-Cooperation Strategies

During the strategy update phase after each round of the game, if a component $p$ of a node’s strategy $S=(p_1,p_2,p_3,p_4)$ is lower than a certain threshold (indicating a very low level of cooperation), then with a certain probability the node randomly selects a neighboring node whose strategy $S'=(p_1',p_2',p_3',p_4')$ has all four components greater than this threshold, and adopts that neighbor’s strategy in the next round.
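The reset rule can be sketched as a single function. The threshold and the reset probability are left as parameters because the text does not fix the threshold (a 50% reset probability is used with Figure 7); keeping the current strategy when no neighbor qualifies is an assumption, and the function name is hypothetical.

```python
import random


def maybe_reset(node, strategies, neighbors, threshold, reset_prob, rng):
    """Return the strategy `node` uses in the next round under the reset rule."""
    own = strategies[node]
    if min(own) >= threshold:
        return own  # no component is below the threshold
    if rng.random() >= reset_prob:
        return own  # reset not triggered this round
    # Only neighbors whose four components all exceed the threshold qualify.
    eligible = [v for v in neighbors[node] if min(strategies[v]) > threshold]
    if not eligible:
        return own  # assumption: keep the strategy if nobody qualifies
    return strategies[rng.choice(eligible)]


strategies = {
    0: (0.1, 0.5, 0.5, 0.5),   # one component below the threshold
    1: (0.1, 0.9, 0.9, 0.9),   # not eligible as a source
    2: (0.6, 0.6, 0.6, 0.6),   # eligible
}
neighbors = {0: [1, 2], 1: [0], 2: [0]}
new_strategy = maybe_reset(0, strategies, neighbors, 0.2, 1.0, random.Random(0))
```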

Each type of experiment described above was repeated 100 times, yielding 100 evolved dominant strategies. Each of these strategies was then tested 1000 times against random strategies, and the average payoff of the random strategies was recorded. A distribution curve of these 100 average payoffs was then plotted.

The rationale is that the initial entry of strategy $S_4$ may influence the evolution of strategies within the network. If the influence is positive, the evolved dominant strategies should promote cooperation within the group. Promoting group cooperation, in turn, is partly reflected in an increased fitness of the group when facing a random population, which appears as an increase in average payoff. The experiment yielded the following results.

Figure 6: The Impact of the Evolution Results on the Fitness of the Random Population after Adding Strategy $S_4$ in the First Round According to Different Rules

From Figure 6, it can be observed that randomly assigning strategy $S_4$ to 10 nodes in the initial network did not have a significant impact on the fitness of the population when facing a random population. Similarly, assigning strategy $S_4$ in the initial network to the 10 nodes with the highest wealth values, the 10 nodes with the highest degrees, or a combination of the 5 nodes with the highest wealth values and the 5 nodes with the highest degrees did not significantly affect the average fitness of the population. These interventions only made the fitness of individuals in the population more concentrated around an intermediate level.

Figure 7: The Impact of the Evolution Results on the Fitness of the Random Population after Adding Strategy $S_4$ in the First Round According to Different Rules and Adding Probabilistic Strategy Optimization Mechanism

From Figure 7, it can be seen that assigning strategy $S_4$ to the 10 nodes with the highest wealth values in the first round, combined with a 50% probability of resetting low-cooperation strategies, resulted in a better distribution of population fitness. On the one hand, this approach led to more individuals having intermediate fitness levels within the population. On the other hand, it also produced a certain number of high-fitness individuals. Furthermore, the range of the horizontal axis indicates that resetting low-cooperation strategies with a 50% probability effectively eliminated individuals with negative payoffs in the random population. This suggests that the strategy reset mechanism can significantly promote cooperation during evolution and in the evolutionary outcome.

4 Conclusion and Discussion

In the experiments above, the strategy domain of the memory-one framework was classified and discussed, with a qualitative analysis of how the four components of strategy $S=(p_1,p_2,p_3,p_4)$ affect its win rate against random strategies. To increase its win rate, the S-user should maintain a low level of cooperation. The experiments also compared the relative effects of the four components. In the study of $(p_1,p_2)$, it was found that “reducing $p_1$” is more conducive to victory than “reducing $p_2$”. Specifically, when both parties cooperated in the previous round, a very low cooperation rate is optimal for the current round. Conversely, when the individual cooperated and the opponent defected in the previous round, a certain level of tolerance can help avoid mutually detrimental outcomes. In the study of $(p_3,p_4)$, it was found that “reducing $p_3$” is more conducive to victory than “reducing $p_4$”.

In the strategy evolution experiments, the K-Means clustering algorithm identified six strategy clusters. Among these clusters, not only did a “self-bad, partner-worse” strategy cluster emerge, but an “altruists” strategy cluster also evolved. Similar to zero-determinant strategies, the “self-bad, partner-worse” strategy can keep the opponent’s payoff lower than its own. The “altruists” strategy, however, is at a disadvantage against the other strategy clusters but achieves the highest payoff in self-play. The existence of this strategy is beneficial for the continuation and development of the group.

In the network evolution experiments involving the “altruists” strategy cluster, introducing this cluster into the initial network according to different rules led to evolved strategies that generally placed the fitness of individuals in the random population at intermediate to high levels, with little impact on the average fitness. After a mechanism for eliminating low-cooperation strategies was added, the fitness of individuals in the population became more concentrated around intermediate levels, and a certain number of high-fitness individuals also emerged. Overall, this mechanism enhanced the fitness of the population and proved to be a method for promoting cooperation.

The experiments provide a theoretical foundation for studying the evolutionary processes of social networks. Through the study of the asymmetric prisoner’s dilemma on a weighted network, we uncover the relationships among strategies in complex evolutionary game environments, offering a framework for individuals and organizations to deploy effective countermeasures in practical decision-making. These findings not only enrich evolutionary game theory but also provide new perspectives and strategies for understanding and promoting cooperative behavior in social systems, thereby opening new avenues for enhancing overall population fitness and sustainable development.

Appendix A Other Win Rate Curves at Different Cooperation Levels

Figure 8: $p_2$’s Impact on Win Rate against Random Strategies When $p_3=0.2$, $p_3=0.5$ and $p_3=0.8$ Respectively

Figure 9: $p_3$’s Impact on Win Rate against Random Strategies When $p_4=0.2$, $p_4=0.5$ and $p_4=0.8$ Respectively

Figure 10: $p_4$’s Impact on Win Rate against Random Strategies When $p_1=0.2$, $p_1=0.5$ and $p_1=0.8$ Respectively

Appendix B Other Comparisons of $p$

Figure 11: Impact of $p_1$ and $p_3$ on Win Rate When $p_2$ and $p_4$ Vary

Figure 12: Impact of $p_2$ and $p_4$ on Win Rate When $p_1$ and $p_3$ Vary

References

  • [1] Robert Axelrod and Robert O Keohane. Achieving cooperation under anarchy: Strategies and institutions. World Politics, 38(1):226–254, 1985.
  • [2] Haihui Cheng and Xinzhu Meng. Evolution of cooperation in multigame with environmental space and delay. Biosystems, 223:104801, 2023.
  • [3] Karl Sigmund. Introduction to evolutionary game theory. Evolutionary Game Dynamics, 69:1–26, 2011.
  • [4] Jörgen W Weibull. Evolutionary game theory. MIT Press, 1997.
  • [5] Zhengwu Zhao and Chunyan Zhang. The mechanisms of labor division from the perspective of task urgency and game theory. Physica A: Statistical Mechanics and its Applications, 630:129284, 2023.
  • [6] Jonathan B King. Prisoner’s paradoxes. Journal of Business Ethics, 7:475–487, 1988.
  • [7] Jesus Gomez-Gardenes, Miguel Romance, Regino Criado, Daniele Vilone, and Angel Sánchez. Evolutionary games defined at the network mesoscale: The public goods game. Chaos: An Interdisciplinary Journal of Nonlinear Science, 21(1), 2011.
  • [8] Alexander J Stewart and Joshua B Plotkin. Extortion and cooperation in the prisoner’s dilemma. Proceedings of the National Academy of Sciences, 109(26):10134–10135, 2012.
  • [9] Robert Axelrod and William D Hamilton. The evolution of cooperation. Science, 211(4489):1390–1396, 1981.
  • [10] Seung Ki Baek, Hyeong-Chai Jeong, Christian Hilbe, and Martin A Nowak. Comparing reactive and memory-one strategies of direct reciprocity. Scientific Reports, 6(1):25676, 2016.
  • [11] Martin A Nowak and Karl Sigmund. Tit for tat in heterogeneous populations. Nature, 355(6357):250–253, 1992.
  • [12] Lee Alan Dugatkin and Michael Alfieri. Guppies and the tit for tat strategy: preference based on past interaction. Behavioral Ecology and Sociobiology, 28:243–246, 1991.
  • [13] Claus Wedekind and Manfred Milinski. Human cooperation in the simultaneous and the alternating prisoner’s dilemma: Pavlov versus generous tit-for-tat. Proceedings of the National Academy of Sciences, 93(7):2686–2689, 1996.
  • [14] Martin Nowak and Karl Sigmund. A strategy of win-stay, lose-shift that outperforms tit-for-tat in the prisoner’s dilemma game. Nature, 364(6432):56–58, 1993.
  • [15] Lorens A Imhof, Drew Fudenberg, and Martin A Nowak. Tit-for-tat or win-stay, lose-shift? Journal of Theoretical Biology, 247(3):574–580, 2007.
  • [16] William H Press and Freeman J Dyson. Iterated prisoner’s dilemma contains strategies that dominate any evolutionary opponent. Proceedings of the National Academy of Sciences, 109(26):10409–10413, 2012.
  • [17] Marialisa Scatà, Alessandro Di Stefano, Aurelio La Corte, Pietro Liò, Emanuele Catania, Ermanno Guardo, and Salvatore Pagano. Combining evolutionary game theory and network theory to analyze human cooperation patterns. Chaos, Solitons & Fractals, 91:17–24, 2016.
  • [18] Martin A Nowak and Robert M May. Evolutionary games and spatial chaos. Nature, 359(6398):826–829, 1992.
  • [19] KM Ariful Kabir, Jun Tanimoto, and Zhen Wang. Influence of bolstering network reciprocity in the evolutionary spatial prisoner’s dilemma game: A perspective. The European Physical Journal B, 91:1–10, 2018.
  • [20] Wen-Bo Du, Xian-Bin Cao, and Mao-Bin Hu. The effect of asymmetric payoff mechanism on evolutionary networked prisoner’s dilemma game. Physica A: Statistical Mechanics and its Applications, 388(24):5005–5012, 2009.
  • [21] Jose A Cuesta, Carlos Gracia-Lázaro, Alfredo Ferrer, Yamir Moreno, and Angel Sánchez. Reputation drives cooperative behaviour and network formation in human groups. Scientific Reports, 5(1):7843, 2015.
  • [22] Qing Jian, Xiaopeng Li, Juan Wang, and Chengyi Xia. Impact of reputation assortment on tag-mediated altruistic behaviors in the spatial lattice. Applied Mathematics and Computation, 396:125928, 2021.
  • [23] Yu-Zhong Chen, Zi-Gang Huang, Sheng-Jun Wang, Yan Zhang, and Ying-Hai Wang. Diversity of rationality affects the evolution of cooperation. Physical Review E—Statistical, Nonlinear, and Soft Matter Physics, 79(5):055101, 2009.
  • [24] Wenxing Ye and Suohai Fan. Evolutionary snowdrift game with rational selection based on radical evaluation. Applied Mathematics and Computation, 294:310–317, 2017.
  • [25] Nastaran Lotfi and Francisco A Rodrigues. On the effect of memory on the prisoner’s dilemma game in correlated networks. Physica A: Statistical Mechanics and its Applications, 607:128162, 2022.
  • [26] Zhipeng Zhang, Yu’e Wu, and Shuhua Zhang. Reputation-based asymmetric comparison of fitness promotes cooperation on complex networks. Physica A: Statistical Mechanics and its Applications, 608:128268, 2022.
  • [27] Christian Hilbe, Martin A Nowak, and Karl Sigmund. Evolution of extortion in iterated prisoner’s dilemma games. Proceedings of the National Academy of Sciences, 110(17):6913–6918, 2013.
  • [28] Yan Bi and Hui Yang. Heterogeneity of strategy persistence promotes cooperation in spatial prisoner’s dilemma game. Physica A: Statistical Mechanics and its Applications, 624:128939, 2023.
  • [29] Genki Ichinose and Naoki Masuda. Zero-determinant strategies in finitely repeated games. Journal of Theoretical Biology, 438:61–77, 2018.
  • [30] Jia-Xu Han and Rui-Wu Wang. Complex interactions promote the frequency of cooperation in snowdrift game. Physica A: Statistical Mechanics and its Applications, 609:128386, 2023.
  • [31] Jialu He, Jianwei Wang, Fengyuan Yu, and Lei Zheng. Reputation-based strategy persistence promotes cooperation in spatial social dilemma. Physics Letters A, 384(27):126703, 2020.
  • [32] György Szabó and Csaba Tőke. Evolutionary prisoner’s dilemma game on a square lattice. Physical Review E, 58(1):69, 1998.
  • [33] Guoyong Mao and Ning Zhang. Fast approximation of average shortest path length of directed BA networks. Physica A: Statistical Mechanics and its Applications, 466:243–248, 2017.
  • [34] Chunyan Zhang, Siyuan Liu, Zhijie Wang, Franz J Weissing, and Jianlei Zhang. The “self-bad, partner-worse” strategy inhibits cooperation in networked populations. Information Sciences, 585:58–69, 2022.
  • [35] Chao Luo, Xiaolin Zhang, Hong Liu, and Rui Shao. Cooperation in memory-based prisoner’s dilemma game on interdependent networks. Physica A: Statistical Mechanics and its Applications, 450:560–569, 2016.