Project Studying Cooperation
Games Through Modeling in Terms
Formally Non-Cooperative Action
a Repeated Game Context
In 1996, at a summer "science camp" I gave a talk on the topic of the use of the "Prisoner's Dilemma" game, in a context of repetition and evolution of strategies, by theoretical biologists who were interested in studying the natural evolution of cooperative adaptations. And after giving the talk I thought more about the concept of studying a game by studying it as a repeated game and through this viewpoint I got an idea of how to eliminate all of the "verbal" complications that could become involved in the consideration of coalitions and coalition formation.
principle, coalitions, and specifically coalitions as considered by Von Neumann
and Morgenstern in "Theory of Games and Economic Behavior", are things
that could be implemented by contracts, like contracts in roman law. But of
course a contract is quite intrinsically a "verbal" thing because
indeed it could (or should!) be written down in words.
My idea was that in a repeated game context that the players could be
given the right to vote for "agencies" or "agents" among
themselves. Thus at a first step a player, say player A, would have the option
to accept player B as his agent. And the effect of this would be that the
coalition (A,B) would be formed (like a committee with B as chairman) without
any verbal processes occurring between A and B. Furthermore, this process adapts
to successive steps of coalescence since if another step of elections is held
then B, as the "agent" representing the coalition (A,B), can vote to
accept the agency of player C and then C will thus become the agent representing
the coalition (A,B,C).
And in this manner, with generalized "agencies" being electable,
the level of a "grand coalition" can always be reached (for a game of
finitely many players), and as a consequence of that the requisites for
"Pareto efficiency" will be available.
Instead of there being unlimited means by which coalitions might actually
be formed or form, dissolve, and reform, we must specify an election procedure
that can form the basis for a game of non-cooperative form to be played by the
players of the original game being translated. And in the context of studying a
repeated game we can afford to prescribe that this election process is such that
the agent is entirely uncommitted and his election is irrevocable for each
specific playing of the game. (Of course election choices are expected to vary
as the game is repeated.)
A set of rules can be devised so that there are election stages in each
of which all of the players remaining independent (not represented by another
player as agent) have, each of them, the option of electing another player as an
accepted agent. It is natural for these rules to require convergence so that no
more than (n-1) stages of election will ever be needed for a game of n players.
Election rules need to do something to resolve the impasse of a situation
where A votes to accept B as his agent but B simultaneously also votes similarly
for A. It is not exactly clear, a priori, which rule version handles these
situations in the most preferable fashion, we have worked with more than one
variant. When we more recently found, in the course of the use of specific model
games for calculations, that it seemed to be desirable to allow elections to be
repeated when an election had failed to result in the election of any agency
power, this finding had the effect of suggesting that election rules which had
the effect that at most one agency could be elected at any stage of the election
process would be most convenient. (And furthermore the actual results of
calculations seem to indicate that this convention on elections is
"asymptotically non-prejudicial" since the probability of simultaneous
votes seemed to tend towards zero while the probability of
"successful" elections tended towards +1.)
If one begins with a quite general CF game and then if one introduces a game formally requiring that all coalition benefits must be attained through the means of the action of agents who have the authority to represent all the members of the coalition then the "agencies game" resulting from this still has the same validly derivable characteristic function as the original game. In essence the coalitions simply have the same potentialities as before, but in a formal sense, for these potentialities to be exploited, the members of a coalition would need TO CONSPIRE on a practical procedure for electing agents successively and finally taking the effective action of the coalition as an action of the agent finally elected to represent all of the members of that coalition.
We have found that it is possible to find "bargaining equilibria"
in games of two or three players using models or modeling of this sort. Ideally,
there could be evolved and developed an analogy to the mathematical methods
(studies of PDE descriptions for the air flow, etc.) that are used in weather
prediction. On a materialistic-commercial level this would naturally apply to
the details of the deals in big corporation mergers (like of Pfizer and
Pharmacia recently). Or also we can hope to get more insight relating to
existing value or evaluation concepts such as, notably, the Shapley value or the
This program of research has led, in connection with the approach to
coalitions and cooperation via the concept of agencies, to the consideration of
varied models of how the players or participants can "react" to
favorable or unfavorable behavior by other players in the repeated game and to
the study of varied concepts of how the players may choose "demands"
that regulate their reactive behavior. For example, in the analogous area of the
study by theoretical biologists of the possibility of the evolution of
cooperation in repeated games of PD type, there have been found various types of
"reaction instinct" that can work to favor cooperation. Besides the
simplest "tit for tat" instinct there can be more complex variants
which may require more memory (of recent past game experiences in the sequence
of repetitions). The analogous situation in terms of models with the election of
agencies is that the "demands" that players could be allowed to make
might be more or less elaborately structured.
I have been studying a more complicated model (for three players) than that previously studied (in 2001) because an expedient used in that simpler model to reduce the number of equations and variables to be solved for seemed to lead to problems. The payoff results seemed to depend on the comparative smallnesses of two different parameters, both of which were supposed to be decreased asymptotically to zero.
Features of Last Year's
and the Current Model
These models are the same in having a procedure of elections by means of
which the agency powers are elected. In the previous model there was a simpler
procedure employed after a first stage of agency elections had been effective in
reducing the remaining set of active players to merely two. We used a sort of
natural bargaining scheme through which competing choices, by the remaining
players or agents, of utility allocations could be rationalized. However this
device of simplification did not follow straightforwardly the basic concept of
the agencies as players in the same sort of game interaction as that for all the
players at the beginning of the playing of the game.
Both versions begin with a first stage of elections where each of the three players could vote (or "choose") any specific one of the other two players to be his agent. In either version these voting or choosing actions (as repeatedly taken in the repeated game context) are described by numbers that represent, effectively, the BEHAVIOR involved, or the probability of the action being taken when the opportunity is presented.
Thus we have an array of six numbers (and three "implied
numbers") that describe this:
For Player 1: For Player 2:
For Player 3:
a1f2 & a1f3 a2f1
a3f1 & a3f2
And these represent the probabilities of specific voting choices. Thus,
for example, a2f1 is the probability that Player 2 will choose (at the first
stage of the game) to elect that Player 1 should be authorized to act as his
agent (as if with "power of attorney"). So it is the "acceptance
rate of P2 for P1" in the concept underlying the notation.
The other aifj variables have parallel meanings. And it is sometimes
convenient to use another type of symbol like n3 = 1 - a3f1 - a3f2 which is the
probability that Player 3 chooses NOT to accept either of Players 1 or 2 as his
agent; or thus this is the probability that Player 3 votes "neither"
or for himself.
And since the three players make their first votes simultaneously there
can be various outcomes. We have chosen a rule to simplify the process by which
there is the proper response to the votes. If the number of acceptance votes
cast by the players at the first voting opportunity is more than one then we
apply a rule that only one of these votes, chosen at random, is to be accepted,
so that the result of the election is either (1): that one of the players has
elected some other player to be his agent or (2): that none of the players has
voted to accept any other player to be an agent representing his interest.
We have also introduced a convention that in case of a failure of the players to come together at all so that no agency has been elected that, with a certain probability, the first stage of elections can be repeated. This idea was also used in our prior model. The probability of the players being given another chance to elect an agent is called (1-e4) or "one minus epsilon sub 4" and the idea is that we want to study results of calculations as e4 tends asymptotically to zero. (It was found, in the case of the previous model, that as e4 would tend towards zero the probabilities like a1f2 would also tend towards zero but that agencies would be elected with high probability because of the continued offering of "second chances" for the basic action of coalescence (the election of an agency).)
Second Stage of the
In our prior model, after one agency had been elected and only two active
players remained, we based the payoff on the utility assignments which had been
chosen by the two remaining players, taking into consideration two numbers
chosen by the player that had been elected as an agent and one number chosen by
the remaining solo player. In the model currently being studied the approach is,
in a sense, more "orthodox" with respect to the idea of agencies and
all possibility of general cooperation is reduced to the idea of the final
election of a "general agent". So when one player has accepted another
as his agent there then remain two freely acting players and the level of
cooperation corresponding to the "grand coalition" is not realized
until one of them has elected the other as his agent.
But if this final agency election fails to be realized then we can allow the existing agency to exploit the resources of the two player coalition formed by the two players involved. (And in a simple case, like we consider, this can lead simply to the use of the resources specified by the characteristic function for the coalition of that pair of players.)
Similarly with the idea for the first stage of elections we allow the
second stage to be repeated, with a probability of (1-e5), if neither side has
elected the other side to agency power, and the idea is that we want to study
the limiting form of the results as e5 asymptotically goes to zero.
Once a "general agent" has been elected then he/she has the
privilege of being able to allocate payoffs, from the total of the payoff
utility resources available, to all of the players, including himself. Our model
has simply a resource total available of +1 which corresponds also to the Pareto
boundary of the game.
For each player there are four possible ways by which he could have
become chosen as final agent. Either of two players may have initially elected
him, and this leads to two cases, or the other two players may have initially
combined in either of two ways followed by his being elected by the agent of
that two player combination. And then as final agent he has essentially to
choose a point in 2 dimensions to determine his choice of Pareto-accessible
This leads to 8 dimensions for each player and there are three players so
that this leads to 24 dimensions in all for the choices made by the players when
they specify allocations of utility after being elected as "final
agent". These are 24 out of a total of 39 "strategy variables"
that are regarded as subject to the individual and individually optimizing choices
by the players.
And the other 15 dimensions of strategic choice by the players correspond to their choice options in relation to their reactive behavior (in the repeated game). The behavior of the players that is affected by or controlled by their reactive strategy choices is, in general, their "acceptance behavior".
Third Stage of the Game;
Allocation of Utility
When two stages of agency election are complete then one of the original
players has become the agent for all and he "allocates" the payoffs.
These are presumed to be Pareto-efficient and so we suppose that he/she
specifies three non-negative numbers with sum = +1. The information for this is
specified by the amounts allocated to the other players and that is two numbers.
This leads to 24 strategic choices in all, of this type, for all the players.
For example, in cases of type UjBijRk player number i is in control and
was first elected by Player j and then by player k and player i chooses to
allocate the payoff of ujbijrk to player j (and ukbijrk to Player k, but that is
another case of allocation strategy variables).
Or for example, u1b3r21 is decided upon by Player 3, who was elected by
Player 2 after Player 2 had been chosen by Player 1 in the first stage. This is
the amount allocated to Player 1 and u2b3r21 would be the amount allocated to
Player 2 (who has a different position in the history of the elections). Player
3 of course also allocates u3b3r21 to himself, but this is eliminated from our
system of 42 equations in 42 variables because of its simple relation to the
other two allocations by Player 3 here.
So there are 24 "utility allocation" variables (which correspond to strategic choices by the players) and they group into the 4 categories of UjBijRk, UkBijRk, UjBiRjk, and UkBiRjk.
and Behavior in the
Stage of the Game
When the "second stage" is reached one player has become an
agent, another player is represented through this agency, and a third player
Suppose that Player 1 now represents 2 and that 3 remains solo. We call
a12, short for a12f3, the probability that 1 now chooses to vote for 3 as the
final agent. (This is observable behavior on the part of Player 1 in the
repeated game.) And we call af12, short for a3f12, the complimentary probability
that Player 3 will vote to accept Player 1 (who already represents Player 2) as
the final agent. These classifications lead to 12 numbers, six of each type.
And the 12 numbers ARE NOT "strategic" choices by the players
involved, rather we arrange that they are determined by "reactive
behavior" regulated by "demands" which are the actual strategic
choices for the players.
For example a12 (or a12f3) is specified to be A12/(1+A12) where A12 will
be a positive number. This makes a12 a positive number less than +1. And the
quantity A12 controlling a12 is specified to be
= Exp[ (u1b3r12 - d12)/e3 ] .
Here e3, or "epsilon sub
3", is intended to be made ultimately very small, as we study the
equilibria of the model. That smallness will make A12 react sharply as d12 and
u1b3r12 vary in relation to each other. The
number "d12" is the "demand" chosen (strategically) by
Player 1 in relation to this situation where he can vote to accept Player 3 as
general (final) agent or alternatively wait and hope that Player 3 will accept
him instead (!). And what the formula takes into consideration is simply the
prospective gain or payoff to Player 1 in the case where Player 3 becomes
general agent and previously Player 1 had been elected to represent Player 2,
and this is specifically u1b3r12.
There are 6 demand strategy numbers of the same type as d12 (which
controls a12). And there are also six quite analogous strategy choices like, for
example, df23 controlling af23 (or a1f23).
In the second stage of agency elections an acceptance can be made either
by a single player accepting the coalition led by another player (and thus that
other player’s leadership as final general agent). Or on the other hand the
player leading a coalition of two may accept the solo player as the final agent
"df23" is a choice by player 1, since it is structured to
control a1f23 or the probability of the acceptance by Player 1, as a solo player
in stage 2 of a game, of the agency of Player 2 when Player 2 is already
representing Player 3.
So we have af23 =
AF23/(1+AF23) or a1f23 = A1F23/(1+A1F23) with the relation of AF23 = Exp[
(u1b23r1 - df23)/e3 ] being
specified for the control of the acceptance behavior af23 by the (strategic)
demand choice df23. Or this could be called, in longer notation,
A1F23 = Exp[ (u1b23r1-d1f23)/e3 ] .
at Stage One
At the first stage of elections, when all three players are
"solo" we have made a choice, conventionally, of how to relate
election behavior to "demands". The choice made is not absolutely free
from any possible arbitrariness and something more complex might also be
Each player, like e.g. player 2, has the option of voting either for
Player 1 (behavior of probability a2f1) or for Player 3 (a2f3) or for not voting
for either of them (behavior described by n2 = 1 - a2f1 - a2f3 ). The model, in
the same manner as our previous model (of 2001), relates these
behavior-describing numbers (or probabilities) to a single demand parameter
called d2 which is all of the strategic choice by Player 2 that relates to stage
1 of the game. We specify in the model that positive numbers A2f1 and A2f3 are
determined (controlled by d2) and that a2fj
= A2fj/(1 + A2f1 + A2f3) where j is
either 1 or 3.
A2fj is specified to be Exp[
( q2j - d2 )/e3 ] where q2j is specified to be the calculated expectation of
payoff, to Player 2, under the hypothesis that the game has passed to stage 2
with Player 2 having become represented by Player j as his agent. So Player 2
chooses, strategically, the demand d2 which is interpretable as describing what
Player 2 "demands" he/she should expect in gains if either, (q21),
Player 1 becomes his representing agent or, (q23), Player 3 becomes the agent
representing Player 2 in stage 2 of the game.
Then the three strategy variables d1, d2, and d3 control the six behavior
probabilities a1f2, a1f3, a2f1, a2f3, a3f1, and a3f2 which completely describe
the actual (observable) behavior of the players in stage 1.
in the Model
In all we have 39 "strategic" variables in the model, 15
"demands" and 24 choices of "utility allocations". But we
replace all the demand variables, like d23 or d1, by associated controlled
behavior probabilities like a23 or a1f2 and a1f3. And then we arrive at simpler
equations with most of the appearance of exponential functions being eliminated.
It is a matter of practical considerations, how to find actual numerical
solutions of the resulting equations. There was experience with this problem
from work on the previous simpler model and much of the programming earlier
developed for use with MATHEMATICA for the finding of numerical solutions could
be adapted to the new model with its larger number of variables. With the help
of an NSF project assistant (AK) the 42 equations to be associated with the 42
unknown variables deriving from the model were derived and cross-checked.
At the present time there are some areas of example games for which
solutions are known and some where no explicit solution has yet been found. For
example, for a game with two symmetrically favored players who can already gain
a shared payoff of 3/4 just by themselves, no proper solution has yet been found
although illegitimate solutions with a variable assigned a game-theoretically
impermissible value have been found.
So it is possible that for this example that some of the
behavior-describing variables should assume extreme values or become irrelevant.
And then a reformulation of the set of equations to be solved may properly
handle this example.
We can illustrate the form of numerical results found for the case on the
borderline of where the nucleolus seems to treat the values of the two-person
coalition as "irrelevant alternatives" and where it depends strongly
on their values. This is the case where Players P1 and P2 are symmetrically
situated and have a coalition value of 1/3 by themselves (while the coalition of
all three players is worth 1 and no other two-person coalition has any value).
We can also illustrate, with a graphics of a sort first prepared last summer, how the Pareto efficiency of the game outcome (derived from the payoffs to the players) tends to improve as e3 tends towards zero. The effect of decreasing e3 is that the "demands" made by the players become effectively "more precise" and that leads to the improvement of the bargaining efficiency, as it were. These illustrative graphics were prepared (by AK and SL) on the basis of calculated solutions for a fully symmetric example game. We were able to find the first solutions for fully symmetric games more easily since the effect of that symmetry was to reduce the number of equations and variables from 42 to merely 7.
Results and Various Theories
The motivation for studying cooperative games via means for modeling the
actions of the players as actions possible in a non-cooperative game is,
ultimately, to shed some light on simple issues like that of assigning values
for the players of the game. This is a simple concept but it seems to be a
difficult and challenging problem.
Other researchers have also recently gotten into studies of the process
of negotiating a mode of cooperation. The work, for example, of Armando Gomes in
Philadelphia was interesting and for three person games his model was giving
either the nucleolus or the Shapley value as the evaluation result. Of course
any method which leads similarly to an "evaluation" of games becomes a
natural basis for an "arbitration scheme", using the words of Luce and
Our modeling, as it was also with the previous model, has three
parameters that describe the values of each of the two-player coalitions. When
these are small positive numbers, say of size less than 1/3, then the nucleolus
evaluation effectively disregards them although the Shapley value gives them a
moderate weighting in its evaluation of the game.
The results of the most recent computational studies using the model (in
which studies I am being assisted by the student SL working on the NSF grant for
the research project) are giving evaluation indications that are more equitably
distributed among the three players than those of the Shapley value. But when
the two person coalitions are quite weak and the nucleolus ignores them our
model does indicate some inequality in the payoffs to the players but less what
the Shapley value would indicate.
If two favored players can gain in their two-person coalition 2/3 of the
total available to the grand coalition (of all three players) then the
computational results from our model indicate significantly less payoff to the
two favored players than would be indicated either by the nucleolus or by the
It seems plausible that some refinements in modeling may allow the
players to more effectively cooperate in terms of the virtues of the
"alliance possibilities" involving subsets of the total set of
players. I have thought of some ideas here, of varying degrees of complexity. In
principle it is quite natural, returning to the analogy with biology and
evolution, that there should be some desirable behavior patterns that encourage
cooperative behavior on the part of other interacting biological
"players". (We already have "demands" in the modeling and it
is possible to design things, effectively, so that the players can
"demand" certain specific formsof cooperative behavior from other
players. For example, they might be enabled to "demand" that
acceptance probabilities should be more in reciprocal relationship. Since any
player is competent to be an agent and thus a leader it might be natural for a
player to "demand" that he/she should be elected to lead with
frequency like that of another player.)
It is notable in the specific model currently studied that when Player 3 seems to gain more than he/she might be expected to gain, when in a disfavored position in relation to the two-player coalition values, that part of HOW Player 3 seems to manage this is by having a rejecting rather than accepting behavior in relation to any elected agency representing a coalition of Players 1 and 2.
Our previous model turned out to give numerical results for cases of weak
two-player coalitions where, depending on the ratio of two "epsilons"
(comparable to "e3" as described above), the evaluation result could
be either "hyper-Shapleyan" (compared with the nucleolus at
(1/3,1/3,1/3)) or "sub-Shapleyan". Ultimately I came to the conclusion
that it was needed to have a MORE UNIFORM concept of "fuzziness" to be
applied in smoothing out the sharpness of the effects of "demand"
choices. Otherwise, and this became evident in earlier studies of two-player
models, if one player would have "sharper" demands and the other would
make "duller" demands then the player with the sharper demands would
indeed become like a "sharper" bargainer and would tend to gain an
advantage in the calculated game outcomes! The fuzziness is needed for
mathematical reasons, so that derivatives of smooth functions can be computed.
But under certain circumstances it seems that "unbalanced" fuzziness
can somehow "prejudice" the game evaluation objective.
Ultimately I thought of a "neutral" source of fuzziness by
means of supposing a general uncertainty about the actual boundary of the
Pareto-accessible utility outcomes.
Earlier, through the Summer of 2002, AK assisted in the work of deriving
and verifying the payoff functions and the equilibrium equations (for equilibria
in pure strategies with parameters describing probabilities of actions by the
players). Also AK managed to effectively set up usable programs for calculating
the nucleolus. These programs had been developed originally by S. Klauke in
More recently ,starting in the Fall, SL has been the student assistant
and has found sets of solutions for varying values of the adjustable parameters,
for example for when b3 = 2/3.