In the mathematical field of game theory, the iterated prisoner's dilemma (IPD) is a two-player game in which, on each round, the participants choose to either cooperate with or defect against their opponent. Defection has the higher immediate payoff, but carries the risk of retaliation by the opposing player in later rounds.
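For concreteness, here are the conventional payoff values (the ones used in Axelrod's original tournaments), seen from the first player's perspective:

    # Standard one-shot payoffs, from the first player's perspective.
    # T (temptation) > R (reward) > P (punishment) > S (sucker's payoff).
    PAYOFFS = {
        ("C", "C"): 3,  # R: both cooperate
        ("C", "D"): 0,  # S: I cooperate, opponent defects
        ("D", "C"): 5,  # T: I defect, opponent cooperates
        ("D", "D"): 1,  # P: both defect
    }

Whatever the opponent does, defecting scores more in a single round (5 > 3 and 1 > 0); the dilemma only bites once the game is iterated and the opponent can answer in kind.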

Over the years, hundreds of strategies have been developed for the IPD, ranging from the very simple (e.g. Tit-For-Tat, which simply mirrors its opponent's last play) to the highly complex, involving advanced mathematics and drawing on psychology, biology, and evolutionary theory.
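To give a feel for the simple end of that spectrum, Tit-For-Tat fits in a few lines. This is a sketch in plain Python, not the Axelrod library's own implementation:

    def tit_for_tat(opponent_history):
        # Cooperate on the first move, then mirror the opponent's last play
        if not opponent_history:
            return "C"
        return opponent_history[-1]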

EXP-RTL (Exponential Retaliation) is my addition to the existing pool of strategies. EXP-RTL cooperates if it makes the first move of the game, and it continues cooperating provided its opponent does the same.

If its opponent defects, EXP-RTL increments a _grudges variable by 1, computes the current value of _grudges raised to the second power, and adds the result to a _retaliations variable. It checks the value of _retaliations on every move and defects whenever it is greater than zero. _retaliations is decremented by 1 after each defection, and once it reaches zero, EXP-RTL resumes cooperating. _grudges is never decremented, so the amount of retaliation inflicted by EXP-RTL grows quadratically with every defection it encounters: the n-th defection triggers n² retaliatory moves.
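A quick back-of-the-envelope run of that bookkeeping (plain Python, with names mirroring the description above, and ignoring for simplicity any retaliations paid off in between) shows how fast the punishment ramps up:

    grudges = retaliations = 0
    for nth_defection in (1, 2, 3, 4):
        grudges += 1                  # never decremented
        retaliations += grudges ** 2  # adds 1, 4, 9, 16, ...
        print(nth_defection, retaliations)
    # Prints 1 1, 2 5, 3 14, 4 30: by the fourth defection the opponent
    # has 30 rounds of retaliation queued up.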

I used the excellent Axelrod library, which bundles hundreds of IPD strategies in a very accessible format, along with some highly convenient tournament functions and evolutionary processes.
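As a quick taste of what the library makes easy, here is a short match between two of its bundled strategies, using its standard Match API:

    import axelrod as axl

    # Five turns of Tit For Tat against an unconditional defector
    match = axl.Match((axl.TitForTat(), axl.Defector()), turns=5)
    print(match.play())         # the sequence of (action, action) pairs
    print(match.final_score())  # the two players' cumulative scores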

 

Below is an implementation of the strategy in the Python programming language:

 

temp.py

import axelrod as axl
import matplotlib.pyplot as plt
import random


class EXP_RTL(axl.player.Player):
    """
    Player cooperates as long as Opponent does the same. If Opponent
    defects, Player increments a _grudges variable by one, computes the
    current value of _grudges raised to the second power, and adds the
    result to a _retaliations variable.

    As long as _retaliations > 0, Player will defect on every turn. After
    each defection by Player, _retaliations is decremented by one, and
    Player won't resume cooperating until _retaliations == 0.

    If Opponent defects while Player is still retaliating, Player
    increments _retaliations by the new value of _grudges raised to the
    second power.
    """

    name = "EXP-RTL"

    classifier = {
        "memory_depth": float("inf"),
        "stochastic": False,
        "makes_use_of": set(),
        "long_run_time": False,
        "inspects_source": False,
        "manipulates_source": False,
        "manipulates_state": False,
    }

    def __init__(self):
        """
        Inherits all the attributes and methods of the Player superclass
        and initializes the counters that will be manipulated during
        gameplay.
        """
        super().__init__()
        self._grudges = self._retaliations = 0

    def strategy(self, opponent):
        # Defines the action taken by the agent on each turn
        C, D = axl.action.Action.C, axl.action.Action.D

        # Cooperate on the first move of the game
        if not opponent.history:
            return C

        if opponent.history[-1] == C:
            if self._retaliations == 0:
                return C
            # Still working off an earlier grudge
            self._retaliations -= 1
            return D

        # Opponent defected: owe _grudges ** 2 retaliations in total.
        # The -1 accounts for the defection returned on this very turn.
        self._grudges += 1
        self._retaliations += (self._grudges ** 2) - 1
        return D

    def tournament(self):
        """
        Runs a tournament between EXP_RTL and all non-long-running and
        non-cheating strategies in the Axelrod library. Strategies with
        very high computational cost, and strategies that try to "cheat",
        e.g. by modifying their own or their opponent's source code, are
        excluded.

        The results of the tournament are written to file, along with a
        series of plots that visualize them.
        """
        # Uncomment the line below to run the tournament with a fixed seed:
        # axl.seed(0)

        players = [self] + [
            strategy() for strategy in axl.strategies
            if not strategy.classifier["long_run_time"]]

        tournament = axl.Tournament(
            players,
            turns=200,
            repetitions=5)

        results = tournament.play()

        # Write the final rankings to file
        with open("results.txt", "a", encoding="UTF-8") as resultsfile:
            for player_rank in results.ranked_names:
                resultsfile.write(f"{player_rank}\n")

        # Write a more comprehensive tournament summary to a CSV file
        results.write_summary("tournament_summary.csv")

        # Create visual representations of the tournament results.
        # All plots are saved to file, including the boxplot, winplot,
        # payoff matrix, and more.
        plot = axl.Plot(results)
        plot.save_all_plots("tournament_visualization")

    def moran_process(self):
        """
        Pits EXP_RTL against a random sample of ten strategies in the
        Moran process, a stochastic population process simulating natural
        selection. Due to the process's high computational cost, only a
        handful of strategies compete at a time, but the random sampling
        ensures that different strategies are included every time the
        method is called, with EXP-RTL being the only constant.
        """
        players = [self] + [
            strategy() for strategy in random.sample(axl.strategies, k=10)
            if not strategy.classifier["long_run_time"]]

        mp = axl.MoranProcess(
            players=players,
            turns=200)

        populations = mp.play()

        # Write the sequence of populations to file
        with open(
                "population_sequence.txt", "a",
                encoding="UTF-8") as sequences:
            for sequence in populations:
                sequences.write(f"{sequence}\n")

        print("\n", f"Winning Strategy: {mp.winning_strategy_name}", "\n")

        # Create a visual representation of the results
        ax = mp.populations_plot()
        plt.show()
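With the class defined, running the experiments is just a matter of instantiating it and calling its methods:

    player = EXP_RTL()
    player.tournament()      # round-robin against the library's strategies
    player.moran_process()   # evolutionary simulation with a random sample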
 

 

Below is a plot of EXP-RTL and nine other strategies competing in the Moran process, a technique that mimics natural selection. Note that this run is without mutation.

 

[Plot: Moran process population dynamics]
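The library also supports mutation in the Moran process. A hedged sketch, assuming the mutation_rate parameter works the way I understand it: with mutation the process never fixates on a single strategy, so it has to be stepped manually rather than run to completion with play().

    import axelrod as axl

    players = [axl.TitForTat(), axl.Defector(), axl.Cooperator()]
    mp = axl.MoranProcess(players=players, turns=100, mutation_rate=0.05)
    for _ in range(50):  # advance 50 generations, then stop
        next(mp)
    print(mp.population_distribution())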

 

The full code repository is available on GitHub.

 
