Potential game


In game theory, a game is said to be a potential game if the incentive of all players to change their strategy can be expressed through a single global function called the potential function. The concept of a congestion game was introduced by Robert W. Rosenthal in 1973; Dov Monderer and Lloyd Shapley introduced the concept of a potential game in 1996 and proved that every congestion game is a potential game.

The properties of several types of potential games have since been studied. A potential game can be either ordinal or cardinal. In a cardinal game, the change in a player's payoff from unilaterally switching strategies, all else held equal, must equal the corresponding change in the potential function. In an ordinal game, only the signs of the two changes have to agree.

The potential function is a useful tool to analyze equilibrium properties of games, since the incentives of all players are mapped into one function, and the set of pure Nash equilibria can be found by locating the local optima of the potential function. Convergence and finite-time convergence of an iterated game towards a Nash equilibrium can also be understood by studying the potential function.

Definition

We first fix some notation. Let N be the number of players, A = A_1 × … × A_N the set of action profiles over the players' action sets A_i, and u the payoff function. For a profile a, write a_i for player i's action and a_−i for the actions of all other players.

A game G = (N, A = A_1 × … × A_N, u : A → ℝ^N) is:

  • an exact potential game if there is a function Φ : A → ℝ such that for all a_−i ∈ A_−i and all a′_i, a″_i ∈ A_i,
    Φ(a′_i, a_−i) − Φ(a″_i, a_−i) = u_i(a′_i, a_−i) − u_i(a″_i, a_−i);
  • a weighted potential game if there is a function Φ : A → ℝ and a vector w ∈ ℝ^N_++ such that for all a_−i ∈ A_−i and all a′_i, a″_i ∈ A_i,
    u_i(a′_i, a_−i) − u_i(a″_i, a_−i) = w_i (Φ(a′_i, a_−i) − Φ(a″_i, a_−i));
  • an ordinal potential game if there is a function Φ : A → ℝ such that for all a_−i ∈ A_−i and all a′_i, a″_i ∈ A_i,
    u_i(a′_i, a_−i) − u_i(a″_i, a_−i) > 0 if and only if Φ(a′_i, a_−i) − Φ(a″_i, a_−i) > 0;
  • a generalized ordinal potential game if there is a function Φ : A → ℝ such that for all a_−i ∈ A_−i and all a′_i, a″_i ∈ A_i,
    u_i(a′_i, a_−i) − u_i(a″_i, a_−i) > 0 implies Φ(a′_i, a_−i) − Φ(a″_i, a_−i) > 0;
  • a best-response potential game if there is a function Φ : A → ℝ such that for all i ∈ N and all a_−i ∈ A_−i,
    b_i(a_−i) = arg max_{a_i ∈ A_i} Φ(a_i, a_−i),
    where b_i(a_−i) is player i's best-response correspondence given a_−i.
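For a finite game, the exact-potential condition can be checked mechanically by enumerating all unilateral deviations. A minimal Python sketch (the helper names `is_exact_potential`, `payoffs`, and `phi` are illustrative, not from any library):

```python
import itertools

def is_exact_potential(actions, payoffs, phi, tol=1e-9):
    """Check whether phi is an exact potential: for every player i and
    every unilateral deviation, the change in player i's payoff must
    equal the change in phi.

    actions  -- list of action sets, one per player
    payoffs  -- payoffs[i](profile) -> payoff of player i
    phi      -- phi(profile) -> candidate potential value
    """
    for profile in itertools.product(*actions):
        for i in range(len(actions)):
            for deviation in actions[i]:
                deviated = profile[:i] + (deviation,) + profile[i + 1:]
                delta_u = payoffs[i](deviated) - payoffs[i](profile)
                delta_phi = phi(deviated) - phi(profile)
                if abs(delta_u - delta_phi) > tol:
                    return False
    return True
```

Replacing the equality test with a sign comparison would check the ordinal condition instead.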

    A simple example

    In a 2-player, 2-strategy game with externalities, individual players' payoffs are given by the function ui(si, sj) = bi si + w si sj, where si is player i's strategy, sj is the opponent's strategy, and w is a positive externality from choosing the same strategy. The strategy choices are +1 and −1, as seen in the payoff matrix in Figure 1.

    This game has a potential function P(s1, s2) = b1 s1 + b2 s2 + w s1 s2.

    If player 1 moves from −1 to +1, the payoff difference is Δu1 = u1(+1, s2) – u1(–1, s2) = 2 b1 + 2 w s2.

    The change in potential is ΔP = P(+1, s2) – P(–1, s2) = (b1 + b2 s2 + w s2) – (–b1 + b2 s2 – w s2) = 2 b1 + 2 w s2 = Δu1.

    The computation for player 2 is analogous. Using the numerical values b1 = 2, b2 = −1, w = 3, this example becomes a simple battle of the sexes, as shown in Figure 2. The game has two pure Nash equilibria, (+1, +1) and (−1, −1). These are also the local maxima of the potential function (Figure 3). The only stochastically stable equilibrium is (+1, +1), the global maximum of the potential function.
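These numerical claims can be verified by brute-force enumeration. A short Python sketch (variable names are illustrative):

```python
import itertools

# Payoffs and potential for the 2x2 example: u_i = b_i s_i + w s_i s_j
b1, b2, w = 2, -1, 3
S = (-1, +1)

def u1(s1, s2): return b1*s1 + w*s1*s2
def u2(s1, s2): return b2*s2 + w*s1*s2
def P(s1, s2):  return b1*s1 + b2*s2 + w*s1*s2

# Pure Nash equilibria: no player gains by a unilateral deviation
nash = [(s1, s2) for s1, s2 in itertools.product(S, S)
        if u1(s1, s2) >= max(u1(t, s2) for t in S)
        and u2(s1, s2) >= max(u2(s1, t) for t in S)]

# Local maxima of the potential under unilateral deviations
local_max = [(s1, s2) for s1, s2 in itertools.product(S, S)
             if P(s1, s2) >= max(P(t, s2) for t in S)
             and P(s1, s2) >= max(P(s1, t) for t in S)]

print(sorted(nash))       # [(-1, -1), (1, 1)]
print(sorted(local_max))  # the same two profiles
print(max(itertools.product(S, S), key=lambda s: P(*s)))  # (1, 1)
```

The pure Nash equilibria coincide with the local maxima of P, and (+1, +1) is the global maximum.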

    A 2-player, 2-strategy game cannot be a potential game unless

    [u1(+1, −1) + u1(−1, +1)] − [u1(+1, +1) + u1(−1, −1)] = [u2(+1, −1) + u2(−1, +1)] − [u2(+1, +1) + u2(−1, −1)]
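This condition is easy to test directly. A sketch (the function name is illustrative):

```python
def satisfies_2x2_condition(u1, u2):
    """Test the 2x2 potential-game condition with actions {+1, -1}:
    the two players' 'diagonal' payoff differences must coincide."""
    lhs = (u1(+1, -1) + u1(-1, +1)) - (u1(+1, +1) + u1(-1, -1))
    rhs = (u2(+1, -1) + u2(-1, +1)) - (u2(+1, +1) + u2(-1, -1))
    return lhs == rhs
```

The example game above satisfies the condition, while matching pennies (u1 = s1 s2, u2 = −s1 s2) does not, which is consistent with matching pennies having no pure Nash equilibrium.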

    Equilibrium selection

    Potential games are guaranteed to have a pure-strategy Nash equilibrium, and they may have several. Learning algorithms such as best response and better response guarantee only that the iterative learning process converges to some Nash equilibrium, not to any particular one when several exist. Equilibrium selective learning algorithms aim to guarantee convergence to the best Nash equilibrium with respect to the potential function. One such algorithm, MaxLogit, provably converges to the best Nash equilibrium at the fastest rate in its class, which can be shown by a mixing-rate analysis of the induced Markov chains. In the special case where every player shares the same objective function (and hence the potential function), and possibly the same action set, the problem is equivalent to distributed combinatorial optimization, which arises in many engineering applications. Equilibrium selective learning algorithms such as MaxLogit can be used for such combinatorial optimization, even in a distributed fashion.
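MaxLogit itself is not reproduced here, but the baseline convergence property is easy to illustrate: in a finite potential game, each strict best-response move strictly increases the potential, so iterated best responses terminate at a pure Nash equilibrium. A sketch (helper names are illustrative):

```python
def best_response_dynamics(actions, payoffs, start, max_iters=1000):
    """Iterated best responses in a finite potential game.  Each
    improving move strictly increases the potential, so the process
    reaches a pure Nash equilibrium (a local maximum of the potential)
    in finitely many steps.  payoffs[i](profile) -> player i's payoff."""
    profile = tuple(start)
    for _ in range(max_iters):
        improved = False
        for i in range(len(actions)):
            best = max(actions[i],
                       key=lambda a: payoffs[i](profile[:i] + (a,) + profile[i+1:]))
            candidate = profile[:i] + (best,) + profile[i+1:]
            if payoffs[i](candidate) > payoffs[i](profile):
                profile = candidate
                improved = True
        if not improved:
            return profile  # no profitable unilateral deviation: pure Nash
    raise RuntimeError("did not converge (not a potential game?)")
```

Which of the equilibria is reached depends on the starting profile; equilibrium selective algorithms add randomized (logit-style) updates precisely to escape inferior local maxima of the potential.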

    Bounded rational models

    A logit equilibrium, the Gibbs measure from statistical mechanics, was shown to be the equilibrium of a finite-player potential game in which players are bounded-rational in one of two ways. Dynamically, players follow the gradient of the potential on the pure-strategy space, perturbed by a random variable (motivated by the inherent behavioral randomness used to justify a classical mixed-strategy Nash equilibrium). Alternatively, a static notion of equilibrium can be used, based on agents arbitraging information out of the system to adapt and improve, as gauged by (Shannon) information entropy. The dynamics are a variation of those that produce a quantal response equilibrium, the difference being that instantaneous utility functions are used instead of expected utilities in order to accommodate markets with repeatedly interacting agents. Specifically, the quantal response equilibrium is a mean-field version of the Gibbs measure.

    For a finite number of agents, both equilibrium approaches yield the same Gibbs equilibrium measure, in which the potential corresponds exactly to the negative of the "energy" in physics. In the context of maximizing information entropy, "conservation of potential" is a constraint on the value of the mean potential, which enforces the degree of non-rational behavior by determining its Lagrange multiplier. This Lagrange multiplier is the inverse temperature and is inversely proportional to the square of the coefficient of the non-rational Gaussian white noise in the drift-diffusion dynamical model (the fluctuation-dissipation theorem). Some important corollaries follow:
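For a finite game the Gibbs equilibrium measure can be written down directly, with the potential playing the role of negative energy: π(a) ∝ exp(Φ(a)/T). A sketch (function and variable names are illustrative):

```python
import itertools
import math

def gibbs_measure(actions, phi, temperature):
    """Gibbs equilibrium measure over joint action profiles:
    pi(a) proportional to exp(phi(a) / T).  The potential phi acts as
    negative energy; lower temperature means more rational play."""
    profiles = list(itertools.product(*actions))
    weights = [math.exp(phi(p) / temperature) for p in profiles]
    z = sum(weights)  # partition function
    return {p: wgt / z for p, wgt in zip(profiles, weights)}
```

As the temperature falls, the measure concentrates on the global maximum of the potential, which for the earlier 2x2 example is the stochastically stable equilibrium (+1, +1).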

  • even a single player is complex, in the sense that their endogenous randomness makes their motion that of an irreversible dissipative (convective) system, which converges to a steady-state Gibbs equilibrium (the concept of equilibrium is qualitatively irreversible; otherwise agents would pass right through it as though it were any other point),
  • any model in economics that uses, a priori, a Gibbsian-derived model from statistical mechanics (standard or mean-field interacting particle systems, such as Curie–Weiss) has a potential (the negative of the Hamiltonian/energy of the a priori model) and can thus be interpreted in the context of a bounded-rational potential game,
  • as a potential refines the Nash equilibria (eliminating local maxima that are not global maxima), statistical mechanics can refine the potential by singling out multiple global maxima via spontaneous symmetry breaking (much as a minimum-free-energy "all up" or "all down" state can be selected in a ferromagnet), and
  • the model satisfies the (Bohr) correspondence principle for any finite or infinite number of players, since the Gibbs measure yields the (refined) Nash equilibrium in the limit of zero-temperature, perfectly rational dynamics; i.e., the model has a limiting classical behavior.

    Phase transitions never occur for a finite number of players, but they can occur in infinite-player games, with spontaneous symmetry breaking and multiple equilibria below a critical "temperature" (degree of non-rational behavior). For sufficiently non-rational behavior, i.e., high enough "temperature", there is always a unique equilibrium state (e.g., by the Dobrushin uniqueness theorem). These phase transitions give rise to the emergence of self-organized patterns (i.e., phases) which, for example, correspond to different macroscopic buying/selling patterns of agents in a particular Cournot competition.

    An economic interpretation of other quantities in the Gibbs formalism, such as "entropy", "magnetization", and "susceptibility", as well as of scaling interactions (local, power-law decay, global competition or collusion, and mixtures of local/global coopetition), has also been given, including in an application to a speculative and hedging model.

    In this speculative and hedging model, two interdependent markets are examined under bounded-rationality assumptions. The existence of multiple equilibria is shown to depend on certain parameters of the model; i.e., the set of equilibria depends on the phase of the model, giving a different perspective on the Sonnenschein–Mantel–Debreu theorem.

    This model has also been used to prove the "inevitability of collusion" result of Huw Dixon in a case for which the neoclassical version of the model does not predict collusion. Here, a Cournot model of a Veblen good is seen to correspond to an "aligning" potential (the term "ferromagnetic" is used in statistical mechanics). The preference for collusion is due to positivity of certain correlation functions, which is shown using cluster expansion techniques for continuous spin models.

    Other models assume that an agent can compute its instantaneous expected payoff by conditionally averaging over the other agents' distributions. This results in a mean-field-type model whose equilibrium is obtained by finding fixed points. The quantal response equilibrium is the simplification of the Gibbs measure to a mean-field model in the case of potential games, and it generalizes to games without a potential.
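Such a mean-field fixed point can be computed by simple iteration. A 2x2 sketch using the logit response (the function names, the expected-gain parameterization, and the sequential update scheme are all illustrative; convergence of plain fixed-point iteration is not guaranteed for every game):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def logit_qre_2x2(gain1, gain2, lam, iters=200):
    """Fixed-point iteration for a logit quantal-response equilibrium
    of a 2x2 game with actions {+1, -1}.  gain1(q) is player 1's
    expected payoff advantage of +1 over -1 when player 2 plays +1
    with probability q (gain2(p) symmetrically); lam is the
    rationality (inverse temperature) parameter."""
    p = q = 0.5  # probabilities of playing +1
    for _ in range(iters):
        p = sigmoid(lam * gain1(q))  # logit response of player 1
        q = sigmoid(lam * gain2(p))  # logit response of player 2
    return p, q
```

For the earlier example (b1 = 2, b2 = −1, w = 3), the expected gains work out to gain1(q) = 12q − 2 and gain2(p) = 12p − 8; at lam = 1 the iteration settles near the (+1, +1) equilibrium, and as lam grows it approaches the pure Nash equilibrium, mirroring the zero-temperature limit discussed above.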
