A Project Studying Cooperation

     in Games Via Modeling in Terms of

      Formally Non-Cooperative Action

        in a Repeated Game Context

 

   In 1996, at a summer "science camp" I gave a talk on

the topic of the use of the "Prisoner's Dilemma" game, in

a context of repetition and evolution of strategies, by

theoretical biologists who were interested in studying

the natural evolution of cooperative adaptations. And

after giving the talk I thought more about the concept

of studying a game by studying it as a repeated game and

through this viewpoint I got an idea of how to eliminate

all of the "verbal" complications that could become

involved in the consideration of coalitions and coalition

formation.

   In principle, coalitions, and specifically coalitions

as considered by Von Neumann and Morgenstern in "Theory

of Games and Economic Behavior", are things that could

be implemented by contracts, like contracts in roman

law. But of course a contract is quite intrinsically

a "verbal" thing because indeed it could (or should!)

be written down in words.

   My idea was that in a repeated game context that the

players could be given the right to vote for "agencies"

or "agents" among themselves. Thus at a first step a

player, say player A, would have the option to accept

player B as his agent. And the effect of this would

be that the coalition (A,B) would be formed (like

a committee with B as chairman) without any verbal

processes occurring between A and B. Furthermore, this

process adapts to successive steps of coalescence since

if another step of elections is held then B, as the

"agent" representing the coalition (A,B), can vote

to accept the agency of player C and then C will thus

become the agent representing the coalition (A,B,C).

   And in this manner, with generalized "agencies" being

electable, the level of a "grand coalition" can always

be reached (for a game of finitely many players), and

as a consequence of that the requisites for "Pareto

efficiency" will be available.

 

 

            More on Agencies

 

   Instead of there being unlimited means by which

coalitions might actually be formed or form, dissolve,

and reform, we must specify an election procedure that

can form the basis for a game of non-cooperative form

to be played by the players of the original game being

translated. And in the context of studying a repeated

game we can afford to prescribe that this election

process is such that the agent is entirely uncommitted

and his election is irrevocable for each specific playing

of the game. (Of course election choices are expected

to vary as the game is repeated.)

   A set of rules can be devised so that there are

election stages in each of which all of the players

remaining independent (not represented by another player

as agent) have, each of them, the option of electing

another player as an accepted agent. It is natural for

these rules to require convergence so that no more than

(n-1) stages of election will ever be needed for a game

of n players.

   Election rules need to do something to resolve the

impasse of a situation where A votes to accept B as his

agent but B simultaneously also votes similarly for A.

It is not exactly clear, a priori, which rule version

handles these situations in the most preferable fashion,

we have worked with more than one variant. When we more

recently found, in the course of the use of specific

model games for calculations, that it seemed to be

desirable to allow elections to be repeated when an

election had failed to result in the election of any

agency power, this finding had the effect of suggesting

that election rules which had the effect that at most one

agency could be elected at any stage of the election

process would be most convenient. (And furthermore the

actual results of calculations seem to indicate that

this convention on elections is "asymptotically non-

prejudicial" since the probability of simultaneous

votes seemed to tend towards zero while the probability

of "successful" elections tended towards +1.)

   If one begins with a quite general CF game and then

if one introduces a game formally requiring that all

coalition benefits must be attained through the means of

the action of agents who have the authority to represent

all the members of the coalition then the "agencies game"

resulting from this still has the same validly derivable

characteristic function as the original game. In essence

the coalitions simply have the same potentialities as

before, but in a formal sense, for these potentialities

to be exploited, the members of a coalition would need

TO CONSPIRE on a practical procedure for electing agents

successively and finally taking the effective action of

the coalition as an action of the agent finally elected

to represent all of the members of that coalition.

 

 

          Current Form of Our Work

 

   We have found that it is possible to find "bargaining

equilibria" in games of two or three players using models

or modeling of this sort. Ideally, there could be evolved

and developed an analogy to the mathematical methods

(studies of PDE descriptions for the air flow, etc.)

that are used in weather prediction. On a materialistic-

commercial level this would naturally apply to the

details of the deals in big corporation mergers (like

of Pfizer and Pharmacia recently). Or also we can hope to

get more insight relating to existing value or evaluation

concepts such as, notably, the Shapley value or the

nucleolus.

   This program of research has led, in connection with

the approach to coalitions and cooperation via the

concept of agencies, to the consideration of varied

models of how the players or participants can "react"

to favorable or unfavorable behavior by other players in

the repeated game and to the study of varied concepts of

how the players may choose "demands" that regulate their

reactive behavior. For example, in the analogous area of

the study by theoretical biologists of the possibility of

the evolution of cooperation in repeated games of PD

type, there have been found various types of "reaction

instinct" that can work to favor cooperation. Besides the

simplest "tit for tat" instinct there can be more complex

variants which may require more memory (of recent past

game experiences in the sequence of repetitions). The

analogous situation in terms of models with the election

of agencies is that the "demands" that players could

be allowed to make might be more or less elaborately

structured.

   I have been studying a more complicated model (for

three players) than that previously studied (in 2001)

because an expedient used in that simpler model to reduce

the number of equations and variables to be solved for

seemed to lead to problems. The payoff results seemed to

depend on the comparative smallnesses of two different

parameters, both of which were supposed to be decreased

asymptotically to zero.

 

 

       Common Features of Last Year's

       Model and the Current Model

 

   These models are the same in having a procedure

of elections by means of which the agency powers are

elected. In the previous model there was a simpler

procedure employed after a first stage of agency

elections had been effective in reducing the remaining

set of active players to merely two. We used a sort

of natural bargaining scheme through which competing

choices, by the remaining players or agents, of utility

allocations could be rationalized. However this device

of simplification did not follow straightforwardly the

basic concept of the agencies as players in the same sort

of game interaction as that for all the players at the

beginning of the playing of the game.

   Both versions begin with a first stage of elections

where each of the three players could vote (or "choose")

any specific one of the other two players to be his

agent. In either version these voting or choosing actions

(as repeatedly taken in the repeated game context) are

described by numbers that represent, effectively, the

BEHAVIOR involved, or the probability of the action being

taken when the opportunity is presented.

   Thus we have an array of six numbers (and three

"implied numbers") that describe this:

 

  For Player 1:     For Player 2:      For Player 3:

 

   a1f2 & a1f3       a2f1 & a2f3        a3f1 & a3f2

    

   And these represent the probabilities of specific

voting choices. Thus, for example, a2f1 is the

probability that Player 2 will choose (at the first stage

of the game) to elect that Player 1 should be authorized

to act as his agent (as if with "power of attorney"). So

it is the "acceptance rate of P2 for P1" in the concept

underlying the notation.

   The other aifj variables have parallel meanings. And

it is sometimes convenient to use another type of symbol

like n3 = 1 - a3f1 - a3f2 which is the probability that

Player 3 chooses NOT to accept either of Players 1 or 2

as his agent; or thus this is the probability that Player

3 votes "neither" or for himself.

   And since the three players make their first votes

simultaneously there can be various outcomes. We have

chosen a rule to simplify the process by which there is

the proper response to the votes. If the number of

acceptance votes cast by the players at the first voting

opportunity is more than one then we apply a rule that

only one of these votes, chosen at random, is to be

accepted, so that the result of the election is either

(1): that one of the players has elected some other

player to be his agent or (2): that none of the players

has voted to accept any other player to be an agent

representing his interest.

   We have also introduced a convention that in case of a

failure of the players to come together at all so that no

agency has been elected that, with a certain probability,

the first stage of elections can be repeated. This idea

was also used in our prior model. The probability of the

players being given another chance to elect an agent is

called (1-e4) or "one minus epsilon sub 4" and the idea

is that we want to study results of calculations as e4

tends asymptotically to zero. (It was found, in the case

of the previous model, that as e4 would tend towards zero

the probabilities like a1f2 would also tend towards zero

but that agencies would be elected with high probability

because of the continued offering of "second chances"

for the basic action of coalescence (the election of an

agency).)

 

 

           The Second Stage of the

            Election of Agencies

 

   In our prior model, after one agency had been elected

and only two active players remained, we based the payoff

on the utility assignments which had been chosen by the

two remaining players, taking into consideration two

numbers chosen by the player that had been elected as

an agent and one number chosen by the remaining solo

player. In the model currently being studied the approach

is, in a sense, more "orthodox" with respect to the idea

of agencies and all possibility of general cooperation is

reduced to the idea of the final election of a "general

agent". So when one player has accepted another as his

agent there then remain two freely acting players and

the level of cooperation corresponding to the "grand

coalition" is not realized until one of them has elected

the other as his agent.

   But if this final agency election fails to be realized

then we can allow the existing agency to exploit the

resources of the two player coalition formed by the

two players involved. (And in a simple case, like we

consider, this can lead simply to the use of the

resources specified by the characteristic function

for the coalition of that pair of players.

   Similarly with the idea for the first stage of

elections we allow the second stage to be repeated, with

a probability of (1-e5), if neither side has elected

the other side to agency power, and the idea is that we

want to study the limiting form of the results as e5

asymptotically goes to zero.

   Once a "general agent" has been elected then he/she

has the privilege of being able to allocate payoffs,

from the total of the payoff utility resources available,

to all of the players, including himself. Our model has

simply a resource total available of +1 which corresponds

also to the Pareto boundary of the game.

   For each player there are four possible ways by which

he could have become chosen as final agent. Either of two

players may have initially elected him, and this leads to

two cases, or the other two players may have initially

combined in either of two ways followed by his being

elected by the agent of that two player combination. And

then as final agent he has essentially to choose a point

in 2 dimensions to determine his choice of Pareto-

accessible utility allocations.

   This leads to 8 dimensions for each player and there

are three players so that this leads to 24 dimensions in

all for the choices made by the players when they specify

allocations of utility after being elected as "final

agent". These are 24 out of a total of 39 "strategy

variables" that are regarded as  subject to the

individual and individually optimizing choices by

the players.

    And the other 15 dimensions of strategic choice

by the players correspond to their choice options in

relation to their reactive behavior (in the repeated

game). The behavior of the players that is affected by

or controlled by their reactive strategy choices is,

in general, their "acceptance behavior".


         The Third Stage of the Game;

          the Allocation of Utility

 

   When two stages of agency election are complete then

one of the original players has become the agent for all

and he "allocates" the payoffs. These are presumed to be

Pareto-efficient and so we suppose that he/she specifies

three non-negative numbers with sum = +1. The information

for this is specified by the amounts allocated to the

other players and that is two numbers. This leads to 24

strategic choices in all, of this type, for all the

players.

   For example, in cases of type UjBijRk player number i

is in control and was first elected by Player j and then

by player k and player i chooses to allocate the payoff

of ujbijrk to player j (and ukbijrk to Player k, but that

is another case of allocation strategy variables).

   Or for example, u1b3r21 is decided upon by Player 3,

who was elected by Player 2 after Player 2 had been

chosen by Player 1 in the first stage. This is the amount

allocated to Player 1 and u2b3r21 would be the amount

allocated to Player 2 (who has a different position in

the history of the elections). Player 3 of course also

allocates u3b3r21 to himself, but this is eliminated from

our system of 42 equations in 42 variables because of its

simple relation to the other two allocations by Player 3

here.

   So there are 24 "utility allocation" variables (which

correspond to strategic choices by the players) and they

group into the 4 categories of UjBijRk, UkBijRk, UjBiRjk,

and UkBiRjk.
 

  

         "Demands" and Behavior in the

           Second Stage of the Game

 

   When the "second stage" is reached one player has

become an agent, another player is represented through

this agency, and a third player remains solo.

   Suppose that Player 1 now represents 2 and that

3 remains solo. We call a12, short for a12f3, the

probability that 1 now chooses to vote for 3 as the

final agent. (This is observable behavior on the part

of Player 1 in the repeated game.) And we call af12,

short for a3f12, the complimentary probability that

Player 3 will vote to accept Player 1 (who already

represents Player 2) as the final agent. These

classifications lead to 12 numbers, six of each type.

   And the 12 numbers ARE NOT "strategic" choices by

the players involved, rather we arrange that they are

determined by "reactive behavior" regulated by "demands"

which are the actual strategic choices for the players.

   For example a12 (or a12f3) is specified to be

A12/(1+A12) where A12 will be a positive number.

This makes a12 a positive number less than +1. And

the quantity A12 controlling a12 is specified to be

A12 = Exp[ (u1b3r12 - d12)/e3 ] .

   Here e3, or "epsilon sub 3", is intended to be made

ultimately very small, as we study the equilibria of the

model. That smallness will make A12 react sharply as d12

and u1b3r12 vary in relation to each other.  The number

"d12" is the "demand" chosen (strategically) by Player 1

in relation to this situation where he can vote to accept

Player 3 as general (final) agent or alternatively wait

and hope that Player 3 will accept him instead (!). And

what the formula takes into consideration is simply the

prospective gain or payoff to Player 1 in the case where

Player 3 becomes general agent and previously Player 1

had been elected to represent Player 2, and this is

specifically u1b3r12.

   There are 6 demand strategy numbers of the same type

as d12 (which controls a12). And there are also six quite

analogous strategy choices like, for example, df23

controlling af23 (or a1f23).

   In the second stage of agency elections an acceptance

can be made either by a single player accepting the

coalition led by another player (and thus that other

player's leadership as final general agent). Or on the

other hand the player leading a coalition of two may

accept the solo player as the final agent for all.

   "df23" is a choice by player 1, since it is structured

to control a1f23 or the probability of the acceptance by

Player 1, as a solo player in stage 2 of a game, of the

agency of Player 2 when Player 2 is already representing

Player 3.

   So we have  af23 = AF23/(1+AF23)  or

a1f23 = A1F23/(1+A1F23)  with the relation of

AF23 = Exp[ (u1b23r1 - df23)/e3 ]  being specified

for the control of the acceptance behavior af23 by

the (strategic) demand choice df23. Or this could be

called, in longer notation, 

A1F23 = Exp[ (u1b23r1-d1f23)/e3 ] .

 

 

           Demands and Acceptance

           Behavior at Stage One

 

   At the first stage of elections, when all three

players are "solo" we have made a choice, conventionally,

of how to relate election behavior to "demands". The

choice made is not absolutely free from any possible

arbitrariness and something more complex might also be

considered appropriate.

   Each player, like e.g. player 2, has the option of

voting either for Player 1 (behavior of probability a2f1)

or for Player 3 (a2f3) or for not voting for either of

them (behavior described by n2 = 1 - a2f1 - a2f3 ).

The model, in the same manner as our previous model

(of 2001), relates these behavior-describing numbers (or

probabilities) to a single demand parameter called d2

which is all of the strategic choice by Player 2 that

relates to stage 1 of the game. We specify in the model

that positive numbers A2f1 and A2f3 are determined

(controlled by d2) and that  a2fj = A2fj/(1 + A2f1 +

A2f3)  where j is either 1 or 3.

   A2fj is specified to be  Exp[ ( q2j - d2 )/e3 ] where

q2j is specified to be the calculated expectation of

payoff, to Player 2, under the hypothesis that the game

has passed to stage 2 with Player 2 having become

represented by Player j as his agent. So Player 2

chooses, strategically, the demand d2 which is

interpretable as describing what Player 2 "demands"

he/she should expect in gains if either, (q21), Player 1

becomes his representing agent or, (q23), Player 3

becomes the agent representing Player 2 in stage 2 of the

game.

   Then the three strategy variables d1, d2, and d3

control the six behavior probabilities a1f2, a1f3, a2f1,

a2f3, a3f1, and a3f2 which completely describe the actual

(observable) behavior of the players in stage 1.

 

 

           Variables in the Model

 

   In all we have 39 "strategic" variables in the model,

15 "demands" and 24 choices of "utility allocations". But

we replace all the demand variables, like d23 or d1, by

associated controlled behavior probabilities like a23 or

a1f2 and a1f3. And then we arrive at simpler equations

with most of the appearance of exponential functions

being eliminated.

   It is a matter of practical considerations, how

to find actual numerical solutions of the resulting

equations. There was experience with this problem from

work on the previous simpler model and much of the

programming earlier developed for use with MATHEMATICA

for the finding of numerical solutions could be adapted

to the new model with its larger number of variables.

With the help of an NSF project assistant (AK) the 42

equations to be associated with the 42 unknown variables

deriving from the model were derived and cross-checked.

   At the present time there are some areas of example

games for which solutions are known and some where no

explicit solution has yet been found. For example, for

a game with two symmetrically favored players who can

already gain a shared payoff of 3/4 just by themselves,

no proper solution has yet been found although

illegitimate solutions with a variable assigned a game-

theoretically impermissible value have been found.

   So it is possible that for this example that some of

the behavior-describing variables should assume extreme

values or become irrelevant. And then a reformulation of

the set of equations to be solved may properly handle

this example.

   We can illustrate the form of numerical results found

for the case on the borderline of where the nucleolus

seems to treat the values of the two-person coalition as

"irrelevant alternatives" and where it depends strongly

on their values. This is the case where Players P1 and P2

are symmetrically situated and have a coalition value of

1/3 by themselves (while the coalition of all three

players is worth 1 and no other two-person coalition has

any value).

   We can also illustrate, with a graphics of a sort

first prepared last summer, how the Pareto efficiency

of the game outcome (derived from the payoffs to the

players) tends to improve as e3 tends towards zero. The

effect of decreasing e3 is that the "demands" made by the

players become effectively "more precise" and that leads

to the improvement of the bargaining efficiency, as it

were. These illustrative graphics were prepared (by AK

and SL) on the basis of calculated solutions for a fully

symmetric example game. We were able to find the first

solutions for fully symmetric games more easily since

the effect of that symmetry was to reduce the number

of equations and variables from 42 to merely 7.

 

 

     Comparing Results and Various Theories

 

   The motivation for studying cooperative games via

means for modeling the actions of the players as actions

possible in a non-cooperative game is, ultimately, to

shed some light on simple issues like that of assigning

values for the players of the game. This is a simple

concept but it seems to be a difficult and challenging

problem.

   Other researchers have also recently gotten into

studies of the process of negotiating a mode of

cooperation. The work, for example, of Armando Gomes in

Philadelphia was interesting and for three person games

his model was giving either the nucleolus or the Shapley

value as the evaluation result. Of course any method

which leads similarly to an "evaluation" of games becomes

a natural basis for an "arbitration scheme", using the

words of Luce and Raiffa.

   Our modeling, as it was also with the previous model,

has three parameters that describe the values of each of

the two-player coalitions. When these are small positive

numbers, say of size less than 1/3, then the nucleolus

evaluation effectively disregards them although the

Shapley value gives them a moderate weighting in its

evaluation of the game.

   The results of the most recent computational studies

using the model (in which studies I am being assisted by

the student SL working on the NSF grant for the research

project) are giving evaluation indications that are more

equitably distributed among the three players than those

of the Shapley value. But when the two person coalitions

are quite weak and the nucleolus ignores them our model

does indicate some inequality in the payoffs to the

players but less what the Shapley value would indicate.  

   If two favored players can gain in their two-person

coalition 2/3 of the total available to the grand

coalition (of all three players) then the computational

results from our model indicate significantly less payoff

to the two favored players than would be indicated either

by the nucleolus or by the Shapley value.

   It seems plausible that some refinements in modeling

may allow the players to more effectively cooperate in

terms of the virtues of the "alliance possibilities"

involving subsets of the total set of players. I have

thought of some ideas here, of varying degrees of

complexity. In principle it is quite natural, returning

to the analogy with biology and evolution, that there

should be some desirable behavior patterns that encourage

cooperative behavior on the part of other interacting

biological "players". (We already have "demands" in

the modeling and it is possible to design things,

effectively, so that the players can "demand" certain

specific forms of cooperative behavior from other

players. For example, they might be enabled to "demand"

that acceptance probabilities should be more in

reciprocal relationship. Since any player is competent to

be an agent and thus a leader it might be natural for a

player to "demand" that he/she should be elected to lead

with frequency like that of another player.)

   It is notable in the specific model currently studied

that when Player 3 seems to gain more than he/she might

be expected to gain, when in a disfavored position in

relation to the two-player coalition values, that part

of HOW Player 3 seems to manage this is by having a

rejecting rather than accepting behavior in relation to

any elected agency representing a coalition of Players 1

and 2.


                 A Technical Issue

 

   Our previous model turned out to give numerical

results for cases of weak two-player coalitions where,

depending on the ratio of two "epsilons" (comparable to

"e3" as described above), the evaluation result could

be either "hyper-Shapleyan" (compared with the nucleolus

at (1/3,1/3,1/3)) or "sub-Shapleyan".

   Ultimately I came to the conclusion that it was needed

to have a MORE UNIFORM concept of "fuzziness" to be

applied in smoothing out the sharpness of the effects

of "demand" choices. Otherwise, and this became evident

in earlier studies of two-player models, if one player

would have "sharper" demands and the other would make

"duller" demands then the player with the sharper

demands would indeed become like a "sharper" bargainer

and would tend to gain an advantage in the calculated

game outcomes! The fuzziness is needed for mathematical

reasons, so that derivatives of smooth functions can be

computed. But under certain circumstances it seems that

"unbalanced" fuzziness can somehow "prejudice" the game

evaluation objective.

   Ultimately I thought of a "neutral" source of

fuzziness by means of supposing a general uncertainty

about the actual boundary of the Pareto-accessible

utility outcomes. 

 

 

            NSF Project Assistants

 

   Earlier, through the Summer of 2002, AK assisted in

the work of deriving and verifying the payoff functions

and the equilibrium equations (for equilibria in pure

strategies with parameters describing probabilities of

actions by the players). Also AK managed to effectively

set up usable programs for calculating the nucleolus.

These programs had been developed originally by S. Klauke

in Bielefeld.

   More recently, starting in the Fall, SL has been the

student assistant and has found sets of solutions for

varying values of the adjustable parameters, for example

for when b3 = 2/3.