{\rtf1\ansi\deff0{\fonttbl{\f0\fmodern\fprq1\fcharset0 Courier New;}} {\colortbl ;\red0\green0\blue0;} \viewkind4\uc1\pard\cf1\lang1033\b\f0\fs32\par A Project Studying Cooperation in Games \par Through Modeling in Terms of Formally \par Non-Cooperative Action in a Repeated \par Game Context\par \par In 1966, at a summer "science camp", I gave a talk on the use of the "Prisoner's Dilemma" game, in a context of repetition and evolution of strategies, by theoretical biologists who were interested in studying the natural evolution of cooperative adaptations. And after giving the talk I thought more about the concept of studying a game as a repeated game, and through this viewpoint I got an idea of how to eliminate all of the "verbal" complications that could become involved in the consideration of coalitions and coalition formation.\par In principle, coalitions, and specifically coalitions as considered by Von Neumann and Morgenstern in "Theory of Games and Economic Behavior", are things that could be \par implemented by contracts, like contracts in Roman law. But of course a contract is quite intrinsically a "verbal" thing because indeed it could (or should!) be written down in words.\par My idea was that, in a repeated game context, the players could be given the right to vote for "agencies" or "agents" among themselves. Thus at a first step a player, say Player A, would have the option to accept Player B as his agent. And the effect of this would be that the coalition (A,B) would be formed (like a committee with B as chairman) without any verbal processes \par occurring between A and B.
Furthermore, this process adapts to successive steps of coalescence since if another step of elections is held then B, as the "agent" representing the coalition (A,B), can vote to accept the agency of Player C, and then C will thus become the agent representing the \par coalition (A,B,C).\par And in this manner, with generalized "agencies" being electable, the level of a "grand coalition" can always be reached (for a game of finitely many players), and as a consequence of that the requisites for "Pareto efficiency" will be available.\par \par \par More on Agencies\par \par Instead of there being unlimited means by which coalitions might actually be formed or form, dissolve, and reform, we must specify an election procedure that can form the basis for a game of non-cooperative form to be played by the players of the original game being translated. And in the context of studying a repeated game we can afford to prescribe that this election process is such that the agent is entirely uncommitted and his election is irrevocable for each specific playing of the game. (Of course election choices are expected to vary as the game is repeated.)\par A set of rules can be devised so that there are election stages in each of which all of the players remaining independent \par (not represented by another player as agent) \par have, each of them, the option of electing another player as an accepted agent. It \par is natural for these rules to require convergence so that no more than (n-1) \par stages of election will ever be needed for a game of n players.\par Election rules need to do something to resolve the impasse of a situation where A votes to accept B as his agent but B simultaneously also votes similarly for A. It is not exactly clear, a priori, which rule version handles these situations in the most preferable fashion; we have worked with more than one variant.
When we more recently found, in the course of the use of specific model games for calculations, that it seemed to be desirable to allow elections to be repeated when an election had failed \par to result in the election of any agency power, this finding suggested that election rules under which at most one agency could be elected at any stage of the election process would be most convenient. (And furthermore the actual results of calculations seem to indicate that this convention on elections is "asymptotically non-prejudicial" since the probability of simultaneous votes seemed to tend towards zero while the probability of "successful" elections tended towards +1.)\par If one begins with a quite general CF (characteristic function) game, and then introduces a game formally requiring that all coalition benefits must be attained through the means of the action of agents who have the authority to represent all the members of the coalition, then the "agencies game" resulting from this still has the same validly derivable characteristic function as the original game. In essence \par the coalitions simply have the same potentialities as before, but in a formal sense, for these potentialities to be exploited, the members of a coalition would need TO CONSPIRE on a practical procedure \par for electing agents successively and finally taking the effective action of the coalition as an action of the agent finally elected \par to represent all of the members of that coalition.\par \par \par Current Form of Our Work \par \par We have found that it is possible to find "bargaining equilibria" in games of two or three players using models or modeling of this sort. And some problems or details of complication have appeared which have led to consideration of refinements in the models.\par Ideally, there could be evolved and developed an analogy to the mathematical methods (studies of PDE descriptions for \par the air flow, etc.) that are used in weather prediction.
On a materialistic-commercial level this would naturally apply to the details of the deals in big corporation mergers (like that of Pfizer and Pharmacia \par recently). Or also we can hope to get more insight relating to existing value or evaluation concepts such as, notably, the Shapley value or the nucleolus.\par This program of research has led, in connection with the approach to coalitions and cooperation via the model of agencies, to the consideration of varied models of \par how the players or participants can "react" to favorable or unfavorable behavior by other players in the repeated game and to the study of varied concepts of how the players may choose "demands" that regulate their reactive behavior. For example, in the analogous area of the study by theoretical \par biologists of the possibility of the evolution of cooperation in repeated games of PD type, there have been found various types of "reaction instinct" that can work to favor cooperation. Besides the simplest "tit for tat" instinct there can be more \par complex variants which may require more memory (of recent past game experiences in the sequence of repetitions). The analogous situation in terms of models with the election of agencies is that the "demands" that players could be allowed to make might be more or less elaborately structured.\par I am now working on a more complicated model (for three players) than that studied last year because an \par expedient used in that simpler model to reduce the number of equations and variables to be solved for seemed to lead to problems (on which more comments later).\par \par \par Common Features of Last Year's \par Model and the Current Model\par \par These models are the same in having a procedure of elections by means of which the agency powers are elected.
In the previous model there was a simpler procedure employed after a first stage of agency elections had been \par effective in reducing the remaining set \par of active players to merely two. We used \par a sort of natural bargaining scheme through \par which competing choices, by the remaining \par players, of utility allocations could be rationalized. However this device of simplification did not follow straightforwardly the basic concept of \par the agencies as players in the same sort \par of game interaction as for all the players at the beginning of the playing of the game. \par Both versions began with a first stage \par of elections where each of the three players could vote (or "choose") any specific one \par of the other two players to be his agent. \par In either version these voting or choosing actions (as repeatedly taken in the repeated game context) are described by numbers that represent, effectively, the BEHAVIOR \par involved, or the probability of the action \par being taken when the opportunity is presented.\par Thus we have an array of six numbers (and three "implied numbers") that describe this:\par \par For Player 1: For Player 2: For Player 3: \par a1f2 & a1f3 a2f1 & a2f3 a3f1 & a3f2\par \par And these represent the probabilities of specific voting choices. Thus, for example, a2f1 is the probability that Player 2 will choose (at the first stage of the game) to \par elect that Player 1 should be authorized \par to act as his agent (as if with "power of attorney"). So it is "acceptance rate \par of P2 for P1" in the concept underlying \par the notation.\par The other aifj variables have parallel meanings.
And it is sometimes convenient to use another type of symbol like \par n3 = 1 - a3f1 - a3f2 which is the probability that Player 3 \par chooses NOT to accept either of Players 1 or 2 as his \par agent; thus this is the probability that Player 3 \par votes "neither" or for himself.\par And since the three players make their first votes \par simultaneously there can be various outcomes. We have chosen \par a rule to simplify the process of responding to the votes. \par If the number of acceptance votes \par cast by the players at the first voting opportunity is more \par than one then we apply a rule that only one of these votes, \par chosen at random, is to be accepted, so that the result of \par the election is either (1): that one of the players has \par elected some other player to be his agent or (2): that none \par of the players has voted to accept any other player to be an \par agent representing his interest. \par We have also introduced a convention that in case of a \par failure of the players to come together at all, so that no \par agency has been elected, then with a certain probability \par the first stage of elections can be repeated. This idea was \par also used in our prior model. The probability of the players being given another chance to elect an agent is called \par (1-e4) or "one minus epsilon sub 4" and the idea is that \par we want to study results of calculations as e4 tends \par asymptotically to zero. (It was found, in the case of the \par previous model, that as e4 would tend towards zero the \par probabilities like a1f2 would also tend towards zero but \par that agencies would be elected with high probability because \par of the continued offering of "second chances" for the basic \par action of coalescence, the election of an agency.)
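\par As a sketch, the first-stage election rule just described (at most one acceptance vote honored, chosen at random, with a failed election repeated with probability (1-e4)) can be simulated. This is purely illustrative; the function name and the particular acceptance probabilities below are hypothetical, not taken from the model's calculations.\par

```python
import random

def first_stage_election(a, e4, rng=random):
    """Simulate the first stage of agency elections for 3 players.

    a[i][j] is the behavior probability (a_ifj) that Player i votes
    to accept Player j as his agent.  Returns (principal, agent) if
    some agency is elected, or None if the election finally fails.
    A failed round is repeated with probability (1 - e4).
    """
    while True:
        votes = []
        for i in (1, 2, 3):
            r = rng.random()
            cumulative = 0.0
            for j in (1, 2, 3):
                if j == i:
                    continue
                cumulative += a[i][j]
                if r < cumulative:
                    votes.append((i, j))  # Player i accepts Player j
                    break
        if votes:
            # If more than one acceptance vote was cast, only one,
            # chosen at random, is accepted.
            return rng.choice(votes)
        # No agency elected: a "second chance" with probability (1 - e4).
        if rng.random() < e4:
            return None

# Example: Player 1 always accepts Player 2; the others never vote.
a = {1: {2: 1.0, 3: 0.0}, 2: {1: 0.0, 3: 0.0}, 3: {1: 0.0, 2: 0.0}}
print(first_stage_election(a, e4=0.5))  # always (1, 2)
```

\par As e4 tends towards zero, failed elections are almost always repeated, which is why agencies end up being elected with high probability even while the individual acceptance rates like a1f2 become small.\par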
\par \par \par The Second Stage of the Election of Agencies\par \par In our prior model, after one agency had been elected \par and only two active players remained, we based the payoff \par on the utility assignments which had been chosen by the two \par remaining players, taking into consideration two numbers \par chosen by the player that had been elected as an agent and \par one number chosen by the remaining solo player. \par In the model currently being studied the approach is, \par in a sense, more "orthodox" with respect to the idea of \par agencies, and all possibility of general cooperation is reduced to the idea of the final election of a "general \par agent". So when one player has accepted another as his \par agent there then remain two freely acting players and \par the level of cooperation corresponding to the "grand \par coalition" is not realized until one of them has elected \par the other as his agent.\par But if this final agency election fails to be realized \par then we can allow the existing agency to exploit the \par resources of the two player coalition formed by the two \par players involved. (And in a simple case, like the one we \par consider, this can lead simply to the use of the resources specified \par by the characteristic function for the coalition of that \par pair of players.)\par As with the idea for the first stage of elections, \par we allow the second stage to be repeated, with a probability \par of (1-e5), if neither side has elected the other side to \par agency power, and the idea is that we want to study the \par limiting form of the results as e5 asymptotically goes to \par zero.\par Once a "general agent" has been elected then he/she has \par the privilege of being able to allocate payoffs, from the \par total of the payoff utility resources available, to all of \par the players, including himself. Our model has simply a \par resource total available of +1 which corresponds also to the \par Pareto boundary of the game.
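\par The limiting effect of these "second chance" conventions, as e4 or e5 tends to zero, can be made concrete with a small calculation. The sketch below assumes a fixed per-round probability q that the round's election succeeds; the function name and the numbers are illustrative only.\par

```python
def eventual_success_prob(q, e5):
    """Probability that an agency is ever elected when each round of
    elections succeeds with probability q and a failed round is
    repeated with probability (1 - e5).

    Solving S = q + (1 - q)(1 - e5) S gives
    S = q / (1 - (1 - q)(1 - e5)).
    """
    return q / (1.0 - (1.0 - q) * (1.0 - e5))

# However small q is, the eventual success probability tends to 1
# as e5 tends towards zero.
for e5 in (0.1, 0.01, 0.001):
    print(e5, eventual_success_prob(0.05, e5))
```

\par This is the sense in which the acceptance behavior probabilities can tend towards zero while "successful" elections still occur with probability tending towards +1.\par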
\par For each player there are four possible ways by which he \par could have become chosen as final agent. Either of two \par players may have initially elected him, and this leads to \par two cases, or the other two players may have initially \par combined in either of two ways followed by his being elected \par by the agent of that two player combination. And then as \par final agent he has essentially to choose a point in 2 \par dimensions to determine his choice of Pareto-accessible \par utility allocations.\par This leads to 8 dimensions for each player and there \par are three players so that this leads to 24 dimensions in \par all for the choices made by the players when they specify \par allocations of utility after being elected as "final agent". \par These are 24 out of a total of 39 "strategy variables" that \par are regarded as subject to the individual and individually \par optimizing choice by the players.\par And the other 15 dimensions of strategic choice by the \par players correspond to their choice options in relation to \par their reactive behavior (in the repeated game). The behavior \par of the players that is affected by or controlled by their \par reactive strategy choices is, in general, their "acceptance \par behavior".\par \par \par The Third Stage of the Game; \par the Allocation of Utility\par \par When two stages of agency election are complete then one \par of the original players has become the agent for all and he \par "allocates" the payoffs. These are presumed to be Pareto-\par efficient and so we suppose that he/she specifies three non-\par negative numbers with sum = +1. The information for this is \par specified by the amounts allocated to the other players and \par that is two numbers. 
This leads to 24 strategic choices in \par all, of this type, for all the players.\par For example, in cases of type UjBijRk Player i is \par in control and was first elected by Player j and then by \par Player k, and Player i chooses to allocate the payoff of \par ujbijrk to Player j (and ukbijrk to Player k, but that is \par another case of allocation strategy variables).\par Or for example, u1b3r21 is decided upon by Player 3, who \par was elected by Player 2 after Player 2 had been chosen by \par Player 1 in the first stage. This is the amount allocated to \par Player 1 and u2b3r21 would be the amount allocated to Player \par 2 (who has a different position in the history of the \par elections). Player 3 of course also allocates u3b3r21 to \par himself, but this is eliminated from our system of 42 \par equations in 42 variables because of its simple relation to \par the other two allocations by Player 3 here.\par So there are 24 "utility allocation" variables (which \par correspond to strategic choices by the players) and they \par group into the 4 categories of UjBijRk, UkBijRk, UjBiRjk, \par and UkBiRjk. \par \par \par "Demands" and Behavior in the \par Second Stage of the Game\par \par When the "second stage" is reached one player has become \par an agent, another player is represented through this agency, \par and a third player remains solo.\par Suppose that Player 1 now represents 2 and that 3 remains \par solo. We call a12, short for a12f3, the probability that 1 \par now chooses to vote for 3 as the final agent. (This is \par observable behavior on the part of Player 1 in the repeated \par game.) And we call af12, short for a3f12, the complementary \par probability that Player 3 will vote to accept Player 1 (who \par already represents Player 2) as the final agent.
These \par classifications lead to 12 numbers, six of each type.\par And the 12 numbers ARE NOT "strategic" choices by \par the players involved, rather we arrange that they are \par determined by "reactive behavior" regulated by "demands" \par which are the actual strategic choices for the players.\par For example a12 (or a12f3) is specified to be A12/(1+A12) \par where A12 will be a positive number. This makes a12 \par a positive number less than +1. And the quantity \par A12 controlling a12 is specified to be A12 = \par Exp[ (u1b3r12 - d12)/e3 ] .\par Here e3, or "epsilon sub 3", is intended to be made \par ultimately very small, as we study the equilibria of the \par model. That smallness will make A12 react sharply as d12 and \par u1b3r12 vary in relation to each other. The number "d12" is \par the "demand" chosen (strategically) by Player 1 in relation \par to this situation where he can vote to accept Player 3 as \par general (final) agent or alternatively wait and hope that \par Player 3 will accept him instead (!). And what the formula \par takes into consideration is simply the prospective gain or \par payoff to Player 1 in the case where Player 3 becomes \par general agent and previously Player 1 had been elected to \par represent Player 2, and this is specifically u1b3r12.\par There are 6 demand strategy numbers like d12 (controlling \par a12). And there are also six quite analogous strategy \par choices like, for example, df23 controlling af23 (or a1f23). \par So "df23" is a choice by player 1, since it controls a1f23 \par or the probability of the acceptance by Player 1, as a solo \par player in stage 2 of a game, of the agency of Player 2 when \par Player 2 is already representing Player 3. \par So we have af23 = AF23/(1+AF23) or \par a1f23 = A1F23/(1+A1F23) with the relation of \par AF23 = Exp[ (u1b23r1 - df23)/e3 ] being specified \par for the control of the acceptance behavior by the \par (strategic) demand choice. 
Or this could be called, \par in longer notation, A1F23 = Exp[ (u1b23r1 - d1f23)/e3 ] .\par \par \par Demands and Acceptance Behavior at Stage One \par \par At the first stage of elections, when all three players \par are "solo", we have made a choice, conventionally, of how to \par relate election behavior to "demands". The choice made is \par not absolutely free from any possible arbitrariness and \par something more complex might also be considered appropriate.\par Each player, e.g. Player 2, has the option of voting \par either for Player 1 (behavior of probability a2f1) or for \par Player 3 (a2f3) or for not voting for either of them \par (behavior described by n2 = 1 - a2f1 - a2f3 ). \par The model, in the same manner as the previously studied \par model, relates these behavior-describing numbers (or \par probabilities) to a single demand parameter called d2 which \par is all of the strategic choice by Player 2 that relates to \par stage 1 of the game. We specify in the model that positive \par numbers A2f1 and A2f3 are determined (controlled by d2) \par and that a2fj = A2fj/(1 + A2f1 + A2f3) where j is either \par 1 or 3. \par A2fj is specified to be Exp[ (q2j - d2)/e3 ] where q2j is \par specified to be the calculated expectation of payoff, to \par Player 2, under the hypothesis that the game has passed to \par stage 2 with Player 2 having become represented by Player j \par as his agent.
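\par A minimal numerical sketch of these demand-controlled behavior formulas may be helpful: the two-way stage-2 form a = A/(1+A) with A = Exp[ (u - d)/e3 ], and the three-way stage-1 form a2fj = A2fj/(1 + A2f1 + A2f3). The particular payoff and demand numbers fed in below are hypothetical.\par

```python
from math import exp

def stage2_acceptance(u, d, e3):
    """a = A/(1+A), A = exp((u - d)/e3): the probability of accepting
    the other side as final agent, given the prospective payoff u in
    that event and the (strategically chosen) demand d."""
    A = exp((u - d) / e3)
    return A / (1.0 + A)

def stage1_acceptances(q21, q23, d2, e3):
    """Player 2's stage-1 behavior (a2f1, a2f3, n2), with
    A2fj = exp((q2j - d2)/e3) and a2fj = A2fj/(1 + A2f1 + A2f3);
    n2 is the probability of voting for neither."""
    A1 = exp((q21 - d2) / e3)
    A3 = exp((q23 - d2) / e3)
    Z = 1.0 + A1 + A3
    return A1 / Z, A3 / Z, 1.0 / Z

# Small e3 makes the reaction sharp: acceptance is nearly certain
# when the prospective payoff exceeds the demand, and nearly zero
# when it falls short.
print(stage2_acceptance(u=0.40, d=0.35, e3=0.10))
print(stage2_acceptance(u=0.40, d=0.35, e3=0.01))
```

\par Note that the three outputs of stage1_acceptances always sum to one, matching n2 = 1 - a2f1 - a2f3.\par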
So Player 2 chooses, strategically, the demand \par d2 which is interpretable as describing what Player 2 \par "demands" he/she should expect in gains if either, (q21), \par Player 1 becomes his representing agent or, (q23), Player 3 \par becomes the agent representing Player 2 in stage 2 of \par the game.\par Then the three strategy variables, d1, d2, and d3, \par control the six behavior probabilities a1f2, a1f3, a2f1, \par a2f3, a3f1, and a3f2 which completely describe the actual \par (observable) behavior of the players in stage 1.\par \par \par Variables in the Model\par \par In all we have 39 "strategic" variables in the model, 15 \par "demands" and 24 choices of "utility allocations". But we \par replace all the demand variables, like d23 or d1, by \par associated controlled behavior probabilities like a23 or \par a1f2 and a1f3. And then we arrive at simpler equations \par with most of the appearance of exponential functions being \par eliminated.\par It is a matter of practical considerations, how to find \par actual numerical solutions of the resulting equations. There \par is experience with this problem from work on the previous \par simpler model; for the current model the work is far from \par complete but has arrived at a reliable derivation of the \par actual equations to be solved (in good form for study with \par applicable computer software) with the aid of the work of \par an NSF project assistant (AK). Furthermore some numerical \par calculations, first for a completely symmetric game and then for general non-symmetric games, give indications that the model is working as might be hoped, at least for small values of the 2-player coalition strengths.
But it is too early to really venture any comments about the asymptotic\par patterns of the solutions or to interpret the results of\par calculations.\par When enough calculations have been made for a variety of non-symmetric games then the values inferred via the calculated payoffs may be compared with various relevant concepts, like the nucleolus, etc.\par So the challenge remains of actually \par finding enough instructively revealing results from \par numerical solutions for equilibria of the model.\par \par \par Comparisons from Results\par \par The model is designed so that the game can be of a \par type where the Shapley value and the nucleolus give quite \par different "evaluations" of the game. Which of these, for \par example as a guide usable in an "arbitration scheme", is \par better or worse (if such comparisons could be valid in any \par sense)? Naturally, any other route which leads, however it \par does so, to an "evaluation", can lead to a basis for \par comparison of these, and of other approaches that give \par evaluations.\par Our modeling, as also with the previous model, has three \par parameters that describe the resources available to each of \par the two-player coalitions.
If these are small positive \par numbers, say of size less than 1/3, then the standard \par nucleolus evaluation effectively disregards them although \par the Shapley value gives them a modest weighting in its \par evaluation of the game.\par Our previous model turned out to give numerical results \par in cases of this sort where, depending on the ratio of two \par "epsilons" (comparable to "e3" as described above) the \par evaluation could be either "hyper-Shapleyan" (compared with \par the nucleolus at \{1/3,1/3,1/3\}) or "sub-Shapleyan".\par Ultimately I came to the conclusion that it was necessary \par to have a MORE UNIFORM concept of "fuzziness" to be applied \par in smoothing out the sharpness of the effects of "demand" \par choices. Otherwise, and this became evident in earlier \par studies of two-player models, if one player would have \par "sharper" demands and the other would make "duller" demands \par then the player with the sharper demands would indeed \par become like a "sharper" bargainer and would tend to gain \par an advantage in the calculated game outcomes! The fuzziness \par is needed for mathematical reasons, so that derivatives of \par smooth functions can be computed. But under certain \par circumstances it seems that "unbalanced" fuzziness can \par somehow "prejudice" the game evaluation objective.\par \par \par Associated Studies\par \par The work on this project which seeks to exploit the \par "agencies" concept to successfully study games with \par cooperation via a reduction to non-cooperative equilibrium \par considerations has led me also to study some other topics.\par One of these was the computability of the nucleolus, \par since if a variety of sample games are studied, and if these \par are of the sort where the classical nucleolus and the \par Shapley value are defined, then just for the comparison \par of numerical results it is nice to have a quick method \par for finding the numbers that form the nucleolus vector.
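\par For the Shapley side of such comparisons the computation is quite direct. The sketch below, with hypothetical coalition-strength numbers b12, b13, b23 each below 1/3 (so that the nucleolus stays at the equal split \{1/3,1/3,1/3\}), computes the Shapley value of a 3-player game with v(i)=0, v(ij)=bij, v(123)=1 by averaging marginal contributions over the orderings of the players.\par

```python
from itertools import permutations

def shapley_3player(v):
    """Shapley value of a 3-player CF game.  v maps frozensets of
    the players {1,2,3} to coalition worths; the empty coalition
    is worth 0."""
    phi = {1: 0.0, 2: 0.0, 3: 0.0}
    orders = list(permutations((1, 2, 3)))
    for order in orders:
        coalition = frozenset()
        for player in order:
            before = v.get(coalition, 0.0)
            coalition = coalition | {player}
            phi[player] += v[coalition] - before  # marginal contribution
    return {i: phi[i] / len(orders) for i in phi}

# Hypothetical two-player coalition strengths, each less than 1/3:
b12, b13, b23 = 0.2, 0.1, 0.05
v = {frozenset({1}): 0.0, frozenset({2}): 0.0, frozenset({3}): 0.0,
     frozenset({1, 2}): b12, frozenset({1, 3}): b13,
     frozenset({2, 3}): b23, frozenset({1, 2, 3}): 1.0}
print(shapley_3player(v))
```

\par Here Player 1, who belongs to the stronger two-player coalitions, receives more than 1/3 while the nucleolus would still assign the equal split; this is the modest weighting referred to above.\par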
\par It occurred to me that that should be possible by a \par programmable Monte Carlo style of method by which random \par perturbations would be used in finding, to a high level \par of approximation, the numerical values of the nucleolus \par components. \par I tried programming that for use in MATHEMATICA knowing \par that often the game would be defined by rational numbers and \par that a close approximation to the exact nucleolus would lead \par to finding the exact answer by finding the simple rational \par numbers for which approximations had been found.\par In the method of successive approximations the actual \par definition of the nucleolus would be the basis for the \par criterion for the comparison of the merits of various random \par perturbations made to an approximate vector.\par Later I learned of the work of Sven Klauke in Bielefeld \par where this sort of a method has been the basis of an \par effective program in C++ and where there has also been \par developed an effective program that works by the method of \par reducing the problem to problems of a "linear programming" \par type.\par And after being at the Stony Brook meeting of last year I \par thought about the characteristic function as calculated by \par Harsanyi (this was around 1960) in connection with \par developing a general solution concept \par for cooperative games. 
And then I learned that if this \par approach were used to transform the interpretation of \par a game of three players, and if after that the nucleolus were \par calculated for the (constant sum) game thus obtained, then \par this would result in THE SAME vector as that of the Shapley \par value for the game (which itself would not be changed by the \par move to the "Harsanyi characteristic function").\par On the other hand, if we consider similar types of four \par person games then the nucleolus computed after transforming \par the coalition values information via the Harsanyi \par characteristic function will not, in general, match the \par Shapley value vector.\par And now this seems to me curiously parallel to how Von \par Neumann, in 1928, concluded that 3-person cooperative games \par could be, somehow, evaluated, but that 4-person games were \par of a different order of difficulty, in terms of evaluation!\par \par \par \par \par }