A Project Studying Cooperation in Games
Through Modeling in Terms of Formally
Non-Cooperative Action in a Repeated
Game Context
In 1996, at a summer
"science camp", I gave a talk on the topic of
the use of the "Prisoner's
Dilemma" game, in a context of repetition and
evolution of strategies, by theoretical
biologists who were interested
in studying the natural evolution of
cooperative adaptations. After giving the talk I thought more about the concept of studying a game as a repeated game, and through this viewpoint I got an idea
of how to eliminate all of the
"verbal" complications that could become
involved in the consideration of
coalitions and coalition formation.
In principle, coalitions,
and specifically coalitions as considered by
Von Neumann and Morgenstern in
"Theory of Games and Economic Behavior",
are things that could be implemented by
contracts, like contracts in Roman law. But of course a contract is
quite intrinsically a "verbal"
thing because indeed it could (or
should!) be written down in words.
My idea was that in a repeated game context the players could be given the right to vote for "agencies" or "agents" among themselves. Thus
at a first step a player, say player A,
would have the option to accept
player B as his agent. And the effect
of this would be that the coalition
(A,B) would be formed (like a committee
with B as chairman) without any
verbal processes occurring between A
and B. Furthermore, this process
adapts to successive steps of
coalescence since if another step of
elections is held then B, as the
"agent" representing the coalition (A,B),
can vote to accept the agency of player
C and then C will thus become the
agent representing the coalition
(A,B,C).
And in this manner, with
generalized "agencies" being electable,
the level of a "grand
coalition" can always be reached (for a game of
finitely many players), and as a
consequence of that the requisites for
"Pareto efficiency" will be
available.
More on Agencies
Instead of there being unlimited means by which coalitions might actually form, dissolve, and re-form, we must specify an
election procedure that can form the
basis for a game of non-cooperative
form to be played by the players of the
original game being translated.
And in the context of studying a
repeated game we can afford to prescribe
that this election process is such that
the agent is entirely uncommitted
and his election is irrevocable for
each specific playing of the game.
(Of course election choices are
expected to vary as the game is repeated.)
A set of rules can be
devised so that there are election stages in
each of which all of the players
remaining independent (not represented
by another player as agent) have, each
of them, the option of electing
another player as an accepted agent. It
is natural for these rules to
require convergence so that no more
than (n-1) stages of election will
ever be needed for a game of n players.
Election rules need to do
something to resolve the impasse of a
situation where A votes to accept B as
his agent but B simultaneously
also votes similarly for A. It is not
exactly clear, a priori, which rule version handles these situations in the most preferable fashion; we have worked with more than one variant. When we more recently found, in the course of calculations with specific model games, that it seemed desirable to allow elections to be repeated when an election had failed to result in the election of any agency power, this finding suggested that election rules under which at most one agency could be elected at any stage of the election process would be most convenient. (And furthermore the
actual results of calculations seem
to indicate that this convention on
elections is "asymptotically non-
prejudicial" since the probability
of simultaneous votes seemed to tend
towards zero while the probability of
"successful" elections tended
towards +1.)
If one begins with a quite general characteristic-function (CF) game and then introduces a game formally requiring that all
coalition benefits must be attained
through the means of the action of
agents who have the authority to
represent all the members of the
coalition then the "agencies game"
resulting from this still has the same
validly derivable characteristic
function as the original game. In
essence the coalitions simply have
the same potentialities as before, but
in a formal sense, for these
potentialities to be exploited, the
members of a coalition would need
TO CONSPIRE on a practical procedure
for electing agents successively and
finally taking the effective action of
the coalition as an action of the
agent finally elected to represent all
of the members of that coalition.
Current Form of Our Work
We have found that it is
possible to find "bargaining equilibria" in
games of two or three players using
models or modeling of this sort. And
some problems or details of
complication have appeared which have led to
consideration of refinements in the
models.
Ideally, there could be
evolved and developed an analogy to the
mathematical methods (studies of PDE
descriptions for the air flow, etc.)
that are used in weather prediction. On
a materialistic-commercial level
this would naturally apply to the
details of the deals in big corporation
mergers (like of Pfizer and Pharmacia
recently). Or also we can hope to
get more insight relating to existing
value or evaluation concepts such
as, notably, the Shapley value or the
nucleolus.
This program of research
has led, in connection with the approach
to coalitions and cooperation via the
model of agencies, to the
consideration of varied models of how
the players or participants can
"react" to favorable or
unfavorable behavior by other players in the
repeated game and to the study of
varied concepts of how the players
may choose "demands" that
regulate their reactive behavior. For example,
in the analogous area of the study by
theoretical biologists of the
possibility of the evolution of
cooperation in repeated games of PD type,
there have been found various types of
"reaction instinct" that can work
to favor cooperation. Besides the
simplest "tit for tat" instinct there
can be more complex variants which may
require more memory (of recent past
game experiences in the sequence of
repetitions). The analogous situation
in terms of models with the election of
agencies is that the "demands"
that players could be allowed to make
might be more or less elaborately
structured.
I am now working on a more complicated model (for three players) than that studied last year, because an expedient used in that
simpler model to reduce the number of
equations and variables to be solved
for seemed to lead to problems (on
which more comments later).
Common Features of Last Year's
Model and the Current Model
These models are the same in having a procedure of elections by means of which the agency powers are established.
In the previous model there was
a simpler procedure employed after a
first stage of agency elections had
been effective in reducing the
remaining set of active players to merely
two. We used a sort of natural
bargaining scheme through which competing
choices, by the remaining players, of
utility allocations could be
rationalized. However this device of
simplification did not follow
straightforwardly the basic concept of
the agencies as players in the same
sort of game interaction as for all the
players at the beginning of the
playing of the game.
Both versions began with a
first stage of elections where each of
the three players could vote (or
"choose") any specific one of the other
two players to be his agent. In either
version these voting or choosing
actions (as repeatedly taken in the
repeated game context) are described
by numbers that represent, effectively,
the BEHAVIOR involved, or the
probability of the action being taken
when the opportunity is presented.
Thus we have an array of
six numbers (and three "implied numbers") that
describe this:
For Player 1:  a1f2 & a1f3
For Player 2:  a2f1 & a2f3
For Player 3:  a3f1 & a3f2
And these represent the
probabilities of specific voting choices. Thus,
for example, a2f1 is the probability
that Player 2 will choose (at the
first stage of the game) to elect that
Player 1 should be authorized to
act as his agent (as if with
"power of attorney"). So it is "acceptance
rate of P2 for P1" in the concept
underlying the notation.
The other aifj variables
have parallel meanings. And it is sometimes
convenient to use another type of
symbol like n3 = 1 - a3f1 - a3f2 which
is the probability that Player 3
chooses NOT to accept either of Players 1
or 2 as his agent; or thus this is the
probability that Player 3 votes
"neither" or for himself.
And since the three
players make their first votes simultaneously there
can be various outcomes. We have chosen
a rule to simplify the process
by which there is the proper response
to the votes. If the number of
acceptance votes cast by the players at
the first voting opportunity is
more than one then we apply a rule that
only one of these votes, chosen
at random, is to be accepted, so that
the result of the election is either
(1): that one of the players has
elected some other player to be his agent
or (2): that none of the players has
voted to accept any other player to
be an agent representing his
interest.
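As a small illustration, the first-stage rule just described can be sketched in code. The function name and the dictionary encoding of the aifj probabilities are only illustrative conventions, not part of the model itself:

```python
import random

def first_stage(a, rng=random):
    """Simulate one first-stage election under the rule described above.

    `a` maps (voter, candidate) pairs to acceptance probabilities, e.g.
    a[(2, 1)] stands for a2f1.  For each player the two acceptance
    probabilities must sum to at most 1 (the remainder is the "neither"
    probability).  Returns (principal, agent) for the single accepted
    vote, or None if no player voted to accept anyone.
    """
    players = [1, 2, 3]
    votes = []
    for i in players:
        others = [j for j in players if j != i]
        r = rng.random()
        # Player i votes for one other player with probability a[(i, j)],
        # or (implicitly) votes "neither" with the remaining probability.
        if r < a[(i, others[0])]:
            votes.append((i, others[0]))
        elif r < a[(i, others[0])] + a[(i, others[1])]:
            votes.append((i, others[1]))
    if not votes:
        return None           # outcome (2): no agency elected
    return rng.choice(votes)  # outcome (1): one vote accepted at random

# Example: symmetric behavior with each acceptance probability 0.2.
probs = {(i, j): 0.2 for i in (1, 2, 3) for j in (1, 2, 3) if i != j}
result = first_stage(probs)
```

The two return branches correspond exactly to the outcomes (1) and (2) described above.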
We have also introduced a convention that, in case of a failure of the players to come together at all so that no agency has been elected, the first stage of elections can, with a certain probability, be repeated.
This idea was also used in our prior
model. The probability of the players
being given another chance to elect an agent
is called (1-e4) or "one
minus epsilon sub 4" and the idea
is that we want to study results of
calculations as e4 tends asymptotically
to zero. (It was found, in the
case of the previous model, that as e4
would tend towards zero the
probabilities like a1f2 would also tend
towards zero but that agencies
would be elected with high probability
because of the continued offering
of "second chances" for the
basic action of coalescence (the election of
an agency).)
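The effect of these "second chances" can be checked with a short calculation. Here p is an assumed fixed per-round probability that some agency gets elected, a simplification of the actual model (where the probabilities themselves are equilibrium quantities); summing the resulting geometric series shows why the success probability tends towards +1 as e4 tends to zero:

```python
def eventual_success(p, e4):
    """Probability that an agency is eventually elected when a failed
    first stage is repeated with probability (1 - e4).

    One round elects an agency with probability p; with probability
    (1 - p) it fails and is re-run with probability (1 - e4).  Then
    S = p + (1 - p) * (1 - e4) * S, which solves to
    S = p / (p + (1 - p) * e4).
    """
    return p / (p + (1.0 - p) * e4)

# Even a small per-round probability yields near-certain eventual
# election as e4 tends to zero.
print(eventual_success(0.05, 1e-6))   # close to 1
```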
The Second Stage of the
Election of Agencies
In our prior model, after
one agency had been elected and only two
active players remained, we based the
payoff on the utility assignments
which had been chosen by the two
remaining players, taking into
consideration two numbers chosen by the
player that had been elected
as an agent and one number chosen by
the remaining solo player. In the
model currently being studied the
approach is, in a sense, more "orthodox"
with respect to the idea of agencies
and all possibility of general
cooperation is reduced to the idea of
the final election of a "general
agent". So when one player has
accepted another as his agent there
then remain two freely acting players
and the level of cooperation
corresponding to the "grand
coalition" is not realized until one of
them has elected the other as his
agent.
But if this final agency
election fails to be realized then we can
allow the existing agency to exploit
the resources of the two player
coalition formed by the two players
involved. (And in a simple case,
like we consider, this can lead simply
to the use of the resources
specified by the characteristic
function for the coalition of that pair
of players.)
Similarly with the idea
for the first stage of elections we allow the
second stage to be repeated, with a
probability of (1-e5), if neither side
has elected the other side to agency
power, and the idea is that we want
to study the limiting form of the
results as e5 asymptotically goes to
zero.
Once a "general agent" has been elected, he/she has the privilege
of being able to allocate payoffs, from
the total of the payoff utility
resources available, to all of the
players, including himself. Our model
has simply a resource total available
of +1 which corresponds also to the
Pareto boundary of the game.
For each player there are
four possible ways by which he could have
become chosen as final agent. Either of
two players may have initially
elected him, and this leads to two
cases, or the other two players may
have initially combined in either of
two ways followed by his being
elected by the agent of that two player
combination. And then as final
agent he has essentially to choose a
point in 2 dimensions to determine
his choice of Pareto-accessible utility
allocations.
This leads to 8 dimensions
for each player and there are three players
so that this leads to 24 dimensions in
all for the choices made by the
players when they specify allocations
of utility after being elected as
"final agent". These are 24
out of a total of 39 "strategy variables" that
are regarded as subject to the
individual and individually optimizing
choices by the players.
And the other 15
dimensions of strategic choice by the players
correspond to their choice options in
relation to their reactive behavior
(in the repeated game). The behavior of
the players that is affected by
or controlled by their reactive
strategy choices is, in general, their
"acceptance behavior".
The Third Stage of the Game;
the Allocation of Utility
When two stages of agency
election are complete then one of the
original players has become the agent
for all and he "allocates" the
payoffs. These are presumed to be
Pareto-efficient and so we suppose
that he/she specifies three
non-negative numbers with sum = +1. The
information for this is specified by the amounts allocated to the other two players, which is two numbers. This leads to 24 strategic choices of this type in all, for all the players.
For example, in cases of type UjBijRk, Player i is in control, having been first elected by Player j and then by Player k, and Player i chooses to allocate the payoff of ujbijrk to Player j (and ukbijrk to Player k, but that is another case of allocation strategy variables).
Or for example, u1b3r21 is
decided upon by Player 3, who was elected by
Player 2 after Player 2 had been chosen
by Player 1 in the first stage.
This is the amount allocated to Player
1 and u2b3r21 would be the amount
allocated to Player 2 (who has a
different position in the history of the
elections). Player 3 of course also
allocates u3b3r21 to himself, but this
is eliminated from our system of 42
equations in 42 variables because of
its simple relation to the other two
allocations by Player 3 here.
So there are 24
"utility allocation" variables (which correspond to
strategic choices by the players) and
they group into the 4 categories
of UjBijRk, UkBijRk, UjBiRjk, and
UkBiRjk.
"Demands" and Behavior in the
Second Stage of the Game
When the "second
stage" is reached one player has become an agent,
another player is represented through
this agency, and a third player
remains solo.
Suppose that Player 1 now
represents 2 and that 3 remains solo. We call
a12, short for a12f3, the probability
that 1 now chooses to vote for 3 as
the final agent. (This is observable
behavior on the part of Player 1 in
the repeated game.) And we call af12,
short for a3f12, the complementary
probability that Player 3 will vote to
accept Player 1 (who already
represents Player 2) as the final
agent. These classifications lead to 12
numbers, six of each type.
And the 12 numbers ARE NOT
"strategic" choices by the players involved,
rather we arrange that they are
determined by "reactive behavior"
regulated by "demands" which
are the actual strategic choices for the
players.
For example a12 (or a12f3)
is specified to be A12/(1+A12) where
A12 will be a positive number. This
makes a12 a positive number less
than +1. And the quantity A12
controlling a12 is specified to be
A12 = Exp[ (u1b3r12 - d12)/e3 ] .
Here e3, or "epsilon
sub 3", is intended to be made ultimately very
small, as we study the equilibria of
the model. That smallness will make
A12 react sharply as d12 and u1b3r12
vary in relation to each other.
The number "d12" is the
"demand" chosen (strategically) by Player 1
in relation to this situation where he
can vote to accept Player 3 as
general (final) agent or alternatively
wait and hope that Player 3 will
accept him instead (!). And what the
formula takes into consideration
is simply the prospective gain or
payoff to Player 1 in the case where
Player 3 becomes general agent and
previously Player 1 had been elected
to represent Player 2, and this is
specifically u1b3r12.
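The acceptance behavior controlled in this way is simply a logistic (smoothed step) function of the gap between the prospective payoff and the demand. A minimal numerical sketch, with illustrative argument names:

```python
import math

def acceptance_probability(u, d, e3):
    """Acceptance probability per a12 = A12/(1 + A12) with
    A12 = Exp[(u1b3r12 - d12)/e3].

    u  - prospective payoff (e.g. u1b3r12)
    d  - the player's strategically chosen "demand" (e.g. d12)
    e3 - smoothing parameter; small e3 makes the response sharp.
    """
    A = math.exp((u - d) / e3)
    return A / (1.0 + A)

# When the prospective payoff matches the demand, acceptance is 1/2;
# for small e3 the behavior approaches a sharp accept/reject step.
print(acceptance_probability(0.4, 0.4, 0.01))   # exactly 0.5
print(acceptance_probability(0.5, 0.4, 0.01))   # near 1
print(acceptance_probability(0.3, 0.4, 0.01))   # near 0
```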
There are 6 demand
strategy numbers like d12 (which controls a12). And
there are also six quite analogous
strategy choices like, for example,
df23 controlling af23 (or a1f23). So
"df23" is a choice by player 1, since
it controls a1f23 or the probability of
the acceptance by Player 1, as a
solo player in stage 2 of a game, of
the agency of Player 2 when Player 2
is already representing Player 3.
So we have af23 =
AF23/(1+AF23) or a1f23 = A1F23/(1+A1F23) with the
relation of AF23 = Exp[ (u1b23r1 -
df23)/e3 ] being specified for the
control of the acceptance behavior by
the (strategic) demand choice. Or
this could be called, in longer
notation, A1F23=Exp[(u1b23r1-d1f23)/e3] .
Demands and Acceptance
Behavior at Stage One
At the first stage of
elections, when all three players are "solo" we
have made a choice, conventionally, of
how to relate election behavior
to "demands". The choice made
is not absolutely free from any possible
arbitrariness and something more
complex might also be considered
appropriate.
Each player, e.g. Player 2, has the option of voting either for
Player 1 (behavior of probability a2f1)
or for Player 3 (a2f3) or for not
voting for either of them (behavior
described by n2 = 1 - a2f1 - a2f3 ).
The model, in the same manner as the
previously studied model, relates
these behavior-describing numbers (or
probabilities) to a single demand
parameter called d2 which is all of the
strategic choice by Player 2
that relates to stage 1 of the game. We
specify in the model that
positive numbers A2f1 and A2f3 are
determined (controlled by d2) and
that a2fj = A2fj/(1 + A2f1 +
A2f3) where j is either 1 or 3.
A2fj is specified to be
Exp[ ( q2j - d2 )/e3 ] where q2j is specified
to be the calculated expectation of
payoff, to Player 2, under the
hypothesis that the game has passed to
stage 2 with Player 2 having become
represented by Player j as his agent.
So Player 2 chooses, strategically,
the demand d2 which is interpretable as
describing what Player 2 "demands"
he/she should expect in gains if
either, (q21), Player 1 becomes his
representing agent or, (q23), Player 3
becomes the agent representing
Player 2 in stage 2 of the game.
Then the three strategy
variables, d1, d2, and d3, control the six
behavior probabilities a1f2, a1f3,
a2f1, a2f3, a3f1, and a3f2 which
completely describe the actual
(observable) behavior of the players in
stage 1.
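The stage-one behavior is thus a softmax-style function of the two prospective payoffs measured against the single demand. A sketch, with q_a and q_b standing for the two q2j values of one player:

```python
import math

def stage_one_behavior(q_a, q_b, d, e3):
    """Stage-one voting probabilities for one player.

    Per the model: a2fj = A2fj/(1 + A2f1 + A2f3) with
    A2fj = Exp[(q2j - d2)/e3], and n2 = 1 - a2f1 - a2f3.

    q_a, q_b - expected payoffs (the q2j) under either possible agent
    d        - the player's single stage-one demand (e.g. d2)
    e3       - smoothing parameter
    Returns the probabilities (vote for first, vote for second, "neither").
    """
    A_a = math.exp((q_a - d) / e3)
    A_b = math.exp((q_b - d) / e3)
    denom = 1.0 + A_a + A_b
    return A_a / denom, A_b / denom, 1.0 / denom

# A demand well above both expectations makes "neither" the likely vote;
# lowering the demand shifts probability toward the better prospective agent.
a2f1, a2f3, n2 = stage_one_behavior(0.30, 0.35, 0.50, 0.02)
```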
Variables in the Model
In all we have 39
"strategic" variables in the model, 15 "demands"
and 24 choices of "utility
allocations". But we replace all the
demand variables, like d23 or d1, by associated controlled behavior probabilities like a23 or a1f2 and a1f3. We then arrive at simpler equations with most of the appearance of exponential functions being eliminated. (Since each of d1, d2, and d3 controls two behavior probabilities, this substitution gives the 42 variables of the system of 42 equations mentioned earlier.)
It is a matter of practical considerations how to find actual numerical solutions of the resulting equations. There is experience with this problem from work on the previous simpler model, but for the current model the work is far from complete; it has, however, arrived at a reliable derivation of the actual equations to be solved (in good form for study with applicable computer software) with the aid of the work of an NSF project assistant (AK). Furthermore
some numerical calculations, first for a completely symmetric game and then for general non-symmetric games, give
indications that the model is working
as might be hoped, at least for
small values of the 2-player coalition
strengths. But it is too early
to really venture any comments about
the asymptotic patterns of the
solutions or to interpret the results
of calculations.
When enough calculations
have been made for a variety of non-symmetric
games then the implied values inferred,
via the calculated payoffs,
may give numbers that can be compared
with those of various relevant
concepts, like the nucleolus, etc.
So the challenge remains
to be dealt with of actually finding enough
of the instructively revealing results
from numerical solutions for
equilibria of the model.
Comparisons from Results
The model is designed so
that the game can be of a type where the
Shapley value and the nucleolus give
quite different "evaluations" of the
game. Which of these, for example as a
guide usable in an "arbitration
scheme", is better or worse (if
such comparisons could be valid in any
sense)? Naturally, any other route
which leads, however it does so, to an
"evaluation", can lead to a
basis for comparison of these, and of other
approaches that give evaluations.
Our modeling, as also with
the previous model, has three parameters
that describe the resources available
to each of the two-player
coalitions. If these are small positive
numbers, say of size less than
1/3, then the standard nucleolus
evaluation effectively disregards them
although the Shapley value gives them a
modest weighting in its evaluation
of the game.
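This contrast is easy to exhibit numerically. The following sketch computes the Shapley value of a three-player CF game by averaging marginal contributions over the 3! orderings; the example game (singletons worth 0, pair values below 1/3, grand coalition worth 1) is of the sort described above, where the nucleolus is simply (1/3, 1/3, 1/3) while the Shapley value gives the pair values a modest weighting:

```python
from itertools import permutations

def shapley_3(v):
    """Shapley value of a 3-player CF game.

    `v` maps frozensets of players to coalition worth; any coalition
    absent from `v` (e.g. the singletons) is worth 0, and the grand
    coalition must be present.
    """
    players = (1, 2, 3)
    phi = {i: 0.0 for i in players}
    for order in permutations(players):
        coalition = frozenset()
        for i in order:
            before = v.get(coalition, 0.0)
            coalition = coalition | {i}
            # Average the marginal contribution over all 3! orderings.
            phi[i] += (v.get(coalition, 0.0) - before) / 6
    return [phi[i] for i in players]

# Small two-player coalition strengths (all below 1/3).
game = {frozenset({1, 2}): 0.2, frozenset({1, 3}): 0.1,
        frozenset({2, 3}): 0.0, frozenset({1, 2, 3}): 1.0}
print(shapley_3(game))   # differs from the nucleolus (1/3, 1/3, 1/3)
```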
Our previous model turned
out to give numerical results in cases of
this sort where, depending on the ratio
of two "epsilons" (comparable to
"e3" as described above) the
evaluation could be either "hyper-Shapleyan"
(compared with the nucleolus at
1/3,1/3,1/3) or "sub-Shapleyan".
Ultimately I came to the conclusion that it was necessary to have a MORE UNIFORM concept of "fuzziness" applied in smoothing out the sharpness of the effects of "demand" choices. Otherwise, and this became evident in earlier studies of two-player models, if one player had "sharper" demands and the other made "duller" demands, then the
player with the sharper demands would
indeed become like a "sharper"
bargainer and would tend to gain an
advantage in the calculated game
outcomes! The fuzziness is needed for
mathematical reasons, so that
derivatives of smooth functions can be
computed. But under certain
circumstances it seems that
"unbalanced" fuzziness can somehow "prejudice"
the game evaluation objective.
Associated Studies
The work on this project
which seeks to exploit the "agencies" concept
to successfully study games with
cooperation via a reduction to non-
cooperative equilibrium considerations
has led me also to study some
other topics.
One of these was the
computability of the nucleolus, since if a variety
of sample games are studied, and if
these are of the sort where the
classical nucleolus and the Shapley
value are defined, then just for the
comparison of numerical results it is
nice to have a quick method for
finding the numbers that form the
nucleolus vector. It occurred to me that
that should be possible by a
programmable method of Monte Carlo style in
which random perturbations would be
used in finding, to a high level of
approximation, the numerical values of
the nucleolus components.
I tried programming that
for use in MATHEMATICA knowing that often the
game would be defined by rational
numbers and that a close approximation
to the exact nucleolus would lead to
finding the exact answer by finding
the simple rational numbers for which
approximations had been found.
In the method of
successive approximations the actual definition of the
nucleolus would be the basis for the
criterion for the comparison of the
merits of various random perturbations
made to an approximate vector.
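A sketch of this Monte Carlo idea: the nucleolus lexicographically minimizes the non-increasing sorted vector of coalition excesses v(S) - x(S), so a random transfer between two players can be accepted exactly when it improves that vector. (The function names, cooling schedule, and step counts here are only illustrative choices, not the actual programs mentioned below.)

```python
import random

def excess_vector(v, x, players):
    """Non-increasing sorted excesses v(S) - x(S) over proper coalitions."""
    n = len(players)
    exc = []
    for mask in range(1, 2 ** n - 1):   # all proper nonempty coalitions
        S = frozenset(p for k, p in enumerate(players) if mask >> k & 1)
        exc.append(v.get(S, 0.0) - sum(x[p] for p in S))
    return sorted(exc, reverse=True)

def nucleolus_mc(v, players, steps=20000, rng=random):
    """Approximate the nucleolus by random perturbations: transfer a
    random amount between two players and keep the move whenever the
    sorted excess vector becomes lexicographically smaller (which is
    the defining criterion of the nucleolus)."""
    n = len(players)
    x = {p: 1.0 / n for p in players}        # start from the equal split
    best = excess_vector(v, x, players)
    step = 0.1
    for _ in range(steps):
        i, j = rng.sample(players, 2)
        delta = rng.uniform(0, step)
        if x[i] - delta < 0:                 # keep the imputation non-negative
            continue
        y = dict(x); y[i] -= delta; y[j] += delta
        cand = excess_vector(v, y, players)
        if cand < best:                      # lexicographic comparison
            x, best = y, cand
        step = max(step * 0.9995, 1e-6)      # shrink the perturbations
    return x

# Example game: v({1,2}) = 0.8, other proper coalitions worth 0.
# For this game the nucleolus is (0.45, 0.45, 0.10).
v = {frozenset({1, 2}): 0.8, frozenset({1, 2, 3}): 1.0}
x = nucleolus_mc(v, [1, 2, 3], steps=40000, rng=random.Random(0))
print(x)
```

For games defined by rational numbers, such an approximation can then be rounded to the simple rational components, as described above.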
Later I learned of the
work of Sven Klauke in Bielefeld where this sort
of a method has been the basis of an
effective program in C++ and where
there has also been developed an
effective program that works by the
method of reducing the problem to
problems of a "linear programming" type.
And after being at the
Stony Brook meeting of last year I thought about
the characteristic function as
calculated by Harsanyi (this was around
1960) in connection with developing a
general solution concept for
cooperative games. And then I learned that if this approach were used to transform the interpretation of a game of three players, and if after that the nucleolus were calculated for the (constant sum) game thus obtained, then this would result in THE SAME
vector as that of the Shapley value for
the game (which itself would not be
changed by the move to the "Harsanyi
characteristic function").
On the other hand, if we
consider similar types of four person games
then the nucleolus computed after
transforming the coalition values
information via the Harsanyi
characteristic function will not, in general,
match the Shapley value vector.
And now this seems to me
curiously parallel to how Von Neumann, in 1928,
concluded that 3-person cooperative
games could be, somehow, evaluated,
but that 4-person games were of a
different order of difficulty, in terms
of evaluation!