Modeling
in Terms of Formally
Action
in a Repeated Game Context
on
the topic of the use of the "Prisoner's Dilemma" game, in a context of
repetition and evolution
cooperative adaptations. And after giving the talk
I thought more about the concept
of studying a game
by
studying it as a repeated game and through this viewpoint I got an idea
of how to eliminate all of the "verbal" complications that could become
involved in
the
consideration of coalitions and coalition formation.
In
principle, coalitions, and specifically coalitions
could (or should!) be written down in words.
My
idea was that in a repeated game context that the players could be given
the right to vote for "agencies" or "agents" among themselves. Thus at
a first step a player, say player A, would have the option to accept player
B as his agent. And the effect of this would
be
that the coalition (A,B) would be formed (like a committee with B as chairman)
without any verbal processes occurring between A and B. Furthermore, this
process adapts to successive steps of coalescence since if another step
of elections is held then B, as the "agent" representing the coalition
(A,B), can vote
to
accept the agency of player C and then C will thus become the agent representing
the coalition (A,B,C).
And
in this manner, with generalized "agencies" being electable, the level
of a "grand coalition" can always be reached (for a game of finitely many
players), and
as
a consequence of that the requisites for "Pareto efficiency" will be available.
and
reform, we must specify an election procedure that
can
form the basis for a game of non-cooperative form
to
be played by the players of the original game being translated. And in
the context of studying a repeated
game
we can afford to prescribe that this election
process
is such that the agent is entirely uncommitted
and
his election is irrevocable for each specific playing of the game. (Of
course election choices are expected to vary as the game is repeated.)
A
set of rules can be devised so that there are election stages in each of
which all of the players remaining independent (not represented by another
player as agent) have, each of them, the option of electing another player
as an accepted agent. It is natural for these rules to require convergence
so that no more than (n-1) stages of election will ever be needed for a
game
of
n players.
Election
rules need to do something to resolve the impasse of a situation where
A votes to accept B as his agent but B simultaneously also votes similarly
for A. It is not exactly clear, a priori, which rule version handles these
situations in the most preferable fashion, we have worked with more than
one variant. When we more recently found, in the course of the use of specific
model games for calculations, that it seemed to be desirable to allow elections
to be repeated when an election had failed to result in the election of
any agency power, this finding had the effect of suggesting that election
rules which had the effect that at most one agency could be elected at
any stage of the election process would be most convenient. (And furthermore
the actual results of calculations seem to indicate that this convention
on elections is "asymptotically non-prejudicial" since the probability
of simultaneous votes seemed to tend towards zero while the probability
of "successful" elections tended towards +1.)
If
one begins with a quite general CF game and then
if
one introduces a game formally requiring that all coalition benefits must
be attained through the means of the action of agents who have the authority
to represent all the members of the coalition then the "agencies game"
resulting from this still has the same validly derivable characteristic
function as the original
game.
In essence the coalitions simply have the same potentialities as before,
but in a formal sense, for these potentialities to be exploited, the members
of
a coalition would need TO CONSPIRE on a practical procedure for electing
agents successively and finally taking the effective action of the coalition
as an action of the agent finally elected to represent all
of
the members of that coalition.
be
evolved and developed an analogy to the mathematical methods (studies of
PDE descriptions for the air flow, etc.) that are used in weather prediction.
On a materialistic-commercial level this would naturally apply to the details
of the deals in big corporation mergers (like of Pfizer and Pharmacia recently).
Or also we can hope to get more insight relating to existing value or evaluation
concepts such as, notably, the Shapley value or the nucleolus.
This
program of research has led, in connection with
the
approach to coalitions and cooperation via the concept of agencies, to
the consideration of varied models of how the players or participants can
"react"
to
favorable or unfavorable behavior by other players in the repeated game
and to the study of varied concepts of how the players may choose "demands"
that regulate their reactive behavior. For example, in the analogous area
of the study by theoretical biologists of the possibility of the evolution
of cooperation in repeated games of PD type, there have been found various
types of "reaction instinct" that can work to favor cooperation. Besides
the simplest "tit for tat" instinct there can be
more
complex variants which may require more memory
(of
recent past game experiences in the sequence of repetitions). The analogous
situation in terms of models with the election of agencies is that the
"demands" that players could be allowed to make might be more or less elaborately
structured.
I
have been studying a more complicated model (for three players) than that
previously studied (in 2001) because an expedient used in that simpler
model to reduce the number of equations and variables to be solved for
seemed to lead to problems. The payoff results seemed to depend on the
comparative smallnesses of two different parameters, both of which were
supposed to be decreased asymptotically to zero.
Model
and the Current Model
of
elections by means of which the agency powers are elected. In the previous
model there was a simpler procedure employed after a first stage of agency
elections had been effective in reducing the remaining set of active players
to merely two. We used a sort of natural bargaining scheme through which
competing choices, by the remaining players or agents, of utility allocations
could be rationalized. However this device of simplification did not follow
straightforwardly the basic concept of the agencies as players in the same
sort of game interaction as that for all the players
at
the beginning of the playing of the game.
Both
versions begin with a first stage of elections where each of the three
players could vote (or "choose") any specific one of the other two players
to be his
agent.
In either version these voting or choosing actions (as repeatedly taken
in the repeated game context) are described by numbers that represent,
effectively, the BEHAVIOR involved, or the probability of the action being
taken when the opportunity is presented.
Thus
we have an array of six numbers (and three "implied numbers") that describe
this:
The other aifj variables have parallel meanings.
And it is sometimes convenient to use another type
of symbol like n3 = 1 - a3f1 - a3f2 which is the probability that Player 3 chooses NOT to accept either of Players 1 or 2 as his agent; or thus this is the probability that Player 3 votes "neither" or for himself.
And
since the three players make their first votes simultaneously there can
be various outcomes. We have chosen a rule to simplify the process by which
there
is
the proper response to the votes. If the number of acceptance votes cast
by the players at the first voting opportunity is more than one then we
apply a rule that only one of these votes, chosen at random, is to be accepted,
so that the result of the election is either (1): that one of the players
has elected some other player to be his agent or (2): that none of the
players has voted to accept any other player to be an agent representing
his interest.
We
have also introduced a convention that in case
of
a failure of the players to come together at all so that no agency has
been elected that, with a certain probability, the first stage of elections
can be repeated. This idea was also used in our prior model. The probability
of the players being given another chance to elect an agent is called (1-e4)
or "one minus epsilon sub 4" and the idea is that we want to study results
of calculations as e4 tends asymptotically to zero. (It was found, in the
case of the previous model, that as e4 would tend towards zero the probabilities
like a1f2 would also tend towards zero but that agencies would be elected
with high probability because of the continued offering of "second chances"
for the basic action of coalescence (the election of an agency).)
Election
of Agencies
to
the idea of agencies and all possibility of general cooperation is reduced
to the idea of the final election of a "general agent". So when one player
has accepted another as his agent there then remain two freely acting players
and the level of cooperation corresponding to the "grand coalition" is
not realized until one of them has elected the other as his agent.
But
if this final agency election fails to be realized then we can allow the
existing agency to exploit the resources of the two player coalition formed
by the two players involved. (And in a simple case, like we consider, this
can lead simply to the use of the resources specified by the characteristic
function for the coalition of that pair of players.)
Similarly
with the idea for the first stage of
elections
we allow the second stage to be repeated, with a probability of (1-e5),
if neither side has elected
the
other side to agency power, and the idea is that we
Once
a "general agent" has been elected then he/she has the privilege of being
able to allocate payoffs, from the total of the payoff utility resources
available, to all of the players, including himself. Our model has simply
a resource total available of +1 which corresponds also to the Pareto boundary
of the game.
For
each player there are four possible ways by which
his
being elected by the agent of that two player combination. And then as
final agent he has essentially to choose a point in 2 dimensions to determine
his choice of Pareto-accessible utility allocations.
This
leads to 8 dimensions for each player and there
are
three players so that this leads to 24 dimensions
in
all for the choices made by the players when they specify allocations of
utility after being elected
as
"final agent". These are 24 out of a total of 39 "strategy variables" that
are regarded as subject to
the
individual and individually optimizing choices
by
the players.
And
the other 15 dimensions of strategic choice
by
the players correspond to their choice options in relation to their reactive
behavior (in the repeated game). The behavior of the players that is affected
by or controlled by their reactive strategy choices is,
in
general, their "acceptance behavior".
the
Allocation of Utility
all
and he "allocates" the payoffs. These are presumed to be Pareto-efficient
and so we suppose that he/she specifies three non-negative numbers with
sum = +1.
The
information for this is specified by the amounts allocated to the other
players and that is two numbers. This leads to 24 strategic choices in
all, of this type, for all the players.
For
example, in cases of type UjBijRk player number i
Or
for example, u1b3r21 is decided upon by Player 3, who was elected by Player
2 after Player 2 had been
chosen
by Player 1 in the first stage. This is the amount allocated to Player
1 and u2b3r21 would be the amount allocated to Player 2 (who has a different
position in the history of the elections). Player 3
of
course also allocates u3b3r21 to himself, but this
is
eliminated from our system of 42 equations in 42 variables because of its
simple relation to the other two allocations by Player 3 here.
So
there are 24 "utility allocation" variables (which correspond to strategic
choices by the players) and they group into the 4 categories of UjBijRk,
UkBijRk, UjBiRjk, and UkBiRjk.
Second
Stage of the Game
become
an agent, another player is represented through
this
agency, and a third player remains solo.
Suppose
that Player 1 now represents 2 and that
3
remains solo. We call a12, short for a12f3, the probability that 1 now
chooses to vote for 3 as the
final
agent. (This is observable behavior on the part of Player 1 in the repeated
game.) And we call af12, short for a3f12, the complimentary probability
that Player 3 will vote to accept Player 1 (who already represents Player
2) as the final agent. These classifications lead to 12 numbers, six of
each type.
And
the 12 numbers ARE NOT "strategic" choices by
the
players involved, rather we arrange that they are determined by "reactive
behavior" regulated by "demands" which are the actual strategic choices
for the players.
For
example a12 (or a12f3) is specified to be A12/(1+A12) where A12 will be
a positive number. This makes a12 a positive number less than +1. And the
quantity A12 controlling a12 is specified to be
A12
= Exp[ (u1b3r12 - d12)/e3 ] .
Here
e3, or "epsilon sub 3", is intended to be made ultimately very small, as
we study the equilibria of the model. That smallness will make A12 react
sharply as d12 and u1b3r12 vary in relation to each other. The number "d12"
is the "demand" chosen (strategically) by
1 in relation to this situation where he can vote
to accept Player 3 as general (final) agent or alternatively wait and hope
that Player 3 will accept him instead (!). And what the formula takes into
consideration is simply the prospective gain or payoff to Player 1 in the
case where Player 3 becomes general agent and previously Player 1 had been
elected to represent Player 2, and this is specifically u1b3r12.
There
are 6 demand strategy numbers of the same type
In
the second stage of agency elections an acceptance can
be made either by a single player accepting the coalition led by another
player (and thus that other
on the
"df23"
is a choice by player 1, since it is structured to control a1f23 or the
probability of
the
acceptance by Player 1, as a solo player in stage 2 of a game, of the agency
of Player 2 when Player 2 is already representing Player 3.
So
we have af23 = AF23/(1+AF23) or
a1f23
= A1F23/(1+A1F23)with the relation of
AF23
= Exp[ (u1b23r1 - df23)/e3 ]
being
specified
for
the control of the acceptance behavior af23 by
the
(strategic) demand choice df23. Or this could
be called, in longer notation,
A1F23
= Exp[ (u1b23r1-d1f23)/e3 ] .
Behavior
at Stage One
three
players are "solo" we have made a choice, conventionally, of how to relate
election behavior
to
"demands". The choice made is not absolutely free from any possible arbitrariness
and something more complex might also be considered appropriate.
Each
player, like e.g. player 2, has the option
of voting either for Player 1 (behavior of probability a2f1) or for Player 3 (a2f3) or for not voting
for either of them (behavior described by
n2
=
manner as our
previous model (of 2001), relates these behavior-describing numbers (or
probabilities) to a single demand parameter called d2 which is all of the
strategic choice by
of
the game. We
either
1 or 3.
A2fj
is specified to be Exp[ ( q2j - d2 )/e3 ]
where
q2j is specified to be the calculated expectation of payoff, to Player
2, under the hypothesis that the
game
has passed to stage 2 with Player 2 having become represented by Player
j as his agent. So Player 2 chooses, strategically, the demand d2 which
is interpretable as describing what Player 2 "demands" he/she should expect
in gains if either, (q21), Player 1 becomes his representing agent or,
(q23), Player 3 becomes the agent representing Player 2 in stage 2 of the
game.
Then
the three strategy variables, d1, d2, and d3, control the six behavior
probabilities a1f2, a1f3, a2f1, a2f3, a3f1, and a3f2 which completely describe
the actual (observable) behavior of the players in stage 1.
It is a matter of practical considerations, how
to
find actual numerical solutions of the resulting equations. There was experience
with this problem from work on the previous simpler model and much of the
programming earlier developed for use with MATHEMATICA for the finding
of numerical solutions could be adapted to the new model with its larger
number of variables. With the help of an NSF project assistant (AK) the
42 equations to be associated with the 42 unknown variables deriving from
the model were derived and cross-checked.
At
the present time there are some areas of example
games
for which solutions are known and some where no explicit solution has yet
been found. For example, for
a
game with two symmetrically favored players who can already gain a shared
payoff of 3/4 just by themselves, no proper solution has yet been found
although illegitimate solutions with a variable assigned a
game-theoretically
impermissible value have been found.
So
it is possible that for this example that some of the behavior-describing
variables should assume extreme values or become irrelevant. And then a
reformulation of the set of equations to be solved may properly handle
this example.
We
can illustrate the form of numerical results found
seems
to treat the values of the two-person coalition as
Players P1 and P2 are symmetrically situated and have
a coalition value
We
can also illustrate, with a graphics of a sort
first
prepared last summer, how the Pareto efficiency
of
the game outcome (derived from the payoffs to the players) tends to improve
as e3 tends towards zero. The effect of decreasing e3 is that the "demands"
made by the players become effectively "more precise" and that leads to
the improvement of the bargaining efficiency, as it were. These illustrative
graphics were prepared (by AK and SL) on the basis of calculated solutions
for a fully symmetric example game. We were able to find the first solutions
for fully symmetric games more easily since the effect of that symmetry
was to reduce the number of equations and variables from 42 to merely 7.
Other
researchers have also recently gotten into studies of the process of negotiating
a mode of cooperation. The work, for example, of Armando Gomes in Philadelphia
was interesting and for three person games his model was giving either
the nucleolus or the Shapley value as the evaluation result. Of course
any method which leads similarly to an "evaluation" of games becomes a
natural basis for an "arbitration scheme", using the words of Luce and Raiffa.
Our
modeling, as it was also with the previous model, has three parameters
that describe the values of each of the two-player coalitions. When these
are small positive numbers, say of size less than 1/3, then the nucleolus
evaluation effectively disregards them although the Shapley value gives
them a moderate weighting in its evaluation of the game.
The
results of the most recent computational studies using the model (in which
studies I am being assisted by the student SL working on the NSF grant
for the research project) are giving evaluation indications that are more
equitably distributed among the three players than those
If
two favored players can gain in their two-person coalition 2/3 of the total
available to the grand
coalition
(of all three players) then the computational results from our model indicate
significantly less payoff to the two favored players than would be indicated
either by the nucleolus or by the Shapley value.
It
seems plausible that some refinements in modeling
may
allow the players to more effectively cooperate in
terms
of the virtues of the "alliance possibilities"
involving
subsets of the total set of players. I have
thought
of some ideas here, of varying degrees of complexity. In principle it is
quite natural, returning
to
the analogy with biology and evolution, that there should be some desirable
behavior patterns that encourage cooperative behavior on the part of other
interacting biological "players". We already have "demands" in the modeling
and it is possible to design things, effectively, so that the players can
"demand" certain specific forms of cooperative behavior from other players.
For example, they might be enabled to "demand" that acceptance probabilities
should be more
in
reciprocal relationship. Since any player is competent to be an agent and thus a
leader it might be natural for a player to "demand" that he/she should
be elected to lead with frequency like that of another player.)
It
is notable in the specific model currently studied
be
expected to gain, when in a disfavored position in relation to the two-player
coalition values, that part of HOW Player 3 seems to manage this is by
having a rejecting rather than accepting behavior in relation to any elected
agency representing a coalition of Players 1 and 2. that
when Player 3 seems to gain more than he/she might
Ultimately
I came to the conclusion that it was needed to have a MORE UNIFORM concept
of "fuzziness"
to
be applied in smoothing out the sharpness of the effects of "demand" choices.
Otherwise, and this became evident in earlier studies of two-player models,
if one player would have "sharper" demands and the other would make "duller"
demands then the player with the sharper demands would indeed become like
a "sharper" bargainer and would tend to gain an advantage in the calculated
game outcomes! The fuzziness is needed for mathematical reasons, so that
derivatives of smooth functions can be computed. But under certain circumstances
it seems that "unbalanced" fuzziness can somehow "prejudice" the game evaluation
objective.
And
ultimately I thought of a "neutral" source of fuzziness by means of supposing
a general uncertainty about the actual boundary of the Pareto-accessible
utility outcomes.
in
the work of deriving and verifying the payoff functions and the equilibrium
equations (for equilibria
in pure strategies with parameters describing probabilities of actions
by the players). Also AK managed to effectively
More
recently ,starting in the Fall, SL has been the
student
assistant and has found sets of solutions for
varying
values of the adjustable parameters, for example