[Figure: Venn diagram of the information-theoretic measures for three variables X, Y, and Z.]
In information theory there have been various attempts over the years to extend the definition of mutual information to more than two random variables. These attempts have met with a great deal of confusion and a realization that interactions among many random variables are poorly understood.
Definition
The conditional mutual information can be used to inductively define a multivariate mutual information (MMI) in a set- or measure-theoretic sense in the context of information diagrams. In this sense the multivariate mutual information is defined by

I(X1; ...; Xn+1) = I(X1; ...; Xn) − I(X1; ...; Xn | Xn+1),

where the conditional multivariate mutual information I(X1; ...; Xn | Xn+1) is the expectation, over Xn+1, of the multivariate mutual information of X1, ..., Xn computed under the conditional distribution given Xn+1.
This definition is identical to that of interaction information except for a change in sign in the case of an odd number of random variables.
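For example, in the three-variable case the inductive definition unrolls, using the standard identities I(X;Y) = H(X) + H(Y) − H(X,Y) and I(X;Y|Z) = H(X,Z) + H(Y,Z) − H(Z) − H(X,Y,Z), to

I(X;Y;Z) = I(X;Y) − I(X;Y|Z) = H(X) + H(Y) + H(Z) − H(X,Y) − H(X,Z) − H(Y,Z) + H(X,Y,Z).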
Alternatively, the multivariate mutual information may be defined in measure-theoretic terms as the measure of the intersection of the sets corresponding to the individual entropies:

I(X1; ...; Xn) = μ( X̃1 ∩ X̃2 ∩ ... ∩ X̃n ),

where X̃i is the set associated with Xi in the information diagram and μ is the signed measure for which μ(X̃i) = H(Xi).

Defining Ỹ = X̃1 ∩ ... ∩ X̃n and using the identity μ(Ỹ ∩ X̃n+1) = μ(Ỹ) − μ(Ỹ \ X̃n+1), where the set difference corresponds to conditioning on Xn+1, gives

I(X1; ...; Xn+1) = I(X1; ...; Xn) − I(X1; ...; Xn | Xn+1),

which is identical to the first definition.
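As a concrete illustration of the inductive definition, here is a minimal sketch (the helper names such as `mmi3` are my own, not standard; only NumPy is assumed) that computes I(X;Y;Z) = I(X;Y) − I(X;Y|Z) from an explicit joint probability table.

```python
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability table (any shape)."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(pxy):
    """I(X;Y) in bits from the joint table pxy of shape (|X|, |Y|)."""
    return entropy(pxy.sum(axis=1)) + entropy(pxy.sum(axis=0)) - entropy(pxy)

def conditional_mutual_information(pxyz):
    """I(X;Y|Z) in bits: the Z-average of I(X;Y) under P(.,.|Z=z)."""
    cmi = 0.0
    for z in range(pxyz.shape[2]):
        pz = pxyz[:, :, z].sum()
        if pz > 0:
            cmi += pz * mutual_information(pxyz[:, :, z] / pz)
    return cmi

def mmi3(pxyz):
    """I(X;Y;Z) = I(X;Y) - I(X;Y|Z), the inductive definition."""
    return mutual_information(pxyz.sum(axis=2)) - conditional_mutual_information(pxyz)

# Example: three identical uniform bits (X = Y = Z) give I(X;Y;Z) = 1 bit.
p = np.zeros((2, 2, 2))
p[0, 0, 0] = p[1, 1, 1] = 0.5
print(mmi3(p))  # 1.0
```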
Properties
Multivariate mutual information and conditional multivariate mutual information can be decomposed into a sum of entropies, cf. Jakulin & Bratko (2003). The general expression on a variable set V = {X1, X2, ..., Xn} is

I(X1; ...; Xn) = −Σ_{T ⊆ V} (−1)^|T| H(T),

which is an alternating (inclusion-exclusion) sum over all subsets T ⊆ V, where H(T) denotes the joint entropy of the variables in T (with H(∅) = 0). The interaction information differs from this by an overall sign when n is odd, as noted above.
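A minimal sketch of this alternating sum (the function name `multivariate_mutual_information` is my own; only NumPy and the standard library are assumed), using the sign convention for the MMI given above so that two variables recover the ordinary mutual information:

```python
import itertools
import numpy as np

def entropy(p):
    """Shannon entropy in bits of a probability table (any shape)."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def multivariate_mutual_information(joint):
    """Alternating (inclusion-exclusion) sum over all subsets of variables.

    joint: n-dimensional array, one axis per random variable, summing to 1.
    Returns -sum_{T subset of V} (-1)^|T| H(T), i.e. I(X1;...;Xn);
    the empty subset contributes nothing since H() = 0.
    """
    n = joint.ndim
    total = 0.0
    for k in range(1, n + 1):
        for subset in itertools.combinations(range(n), k):
            # Marginalize out every axis not in the subset.
            drop = tuple(ax for ax in range(n) if ax not in subset)
            marginal = joint.sum(axis=drop) if drop else joint
            total += -((-1) ** k) * entropy(marginal)
    return total

# Two identical uniform bits: recovers I(X;Y) = 1 bit.
p2 = np.array([[0.5, 0.0], [0.0, 0.5]])
print(multivariate_mutual_information(p2))  # 1.0
```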
Synergy and redundancy
The multivariate mutual information may be positive, negative or zero. For the simplest case of three variables X, Y, and Z, knowing, say, X yields a certain amount of information about Z. This information is just the mutual information I(Z;X) (yellow and gray in the Venn diagram above). Likewise, knowing Y will also yield a certain amount of information about Z, that being the mutual information I(Y;Z) (cyan and gray in the Venn diagram above). The amount of information about Z which is yielded by knowing both X and Y together is the information that is mutual to Z and the X,Y pair, written I(X,Y;Z) (yellow, gray and cyan in the Venn diagram above), and it may be greater than, equal to, or less than the sum of the two mutual informations, this difference being the multivariate mutual information: I(X;Y;Z) = I(Y;Z) + I(Z;X) − I(X,Y;Z).

In the case where the sum of the two mutual informations is greater than I(X,Y;Z), the multivariate mutual information will be positive. In this case, some of the information about Z provided by knowing X is also provided by knowing Y, causing their sum to be greater than the information about Z from knowing both together. That is to say, there is a "redundancy" in the information about Z provided by the X and Y variables.

In the case where the sum of the mutual informations is less than I(X,Y;Z), the multivariate mutual information will be negative. In this case, knowing both X and Y together provides more information about Z than the sum of the information yielded by knowing either one alone. That is to say, there is a "synergy" in the information about Z provided by the X and Y variables.

The above explanation is intended to give an intuitive understanding of the multivariate mutual information, but it obscures the fact that it does not depend upon which variable is the subject (e.g., Z in the above example) and which other two are being thought of as the source of information.
Example of positive multivariate mutual information (redundancy)
Positive MMI is typical of common-cause structures. For example, clouds cause rain and also block the sun; therefore, the correlation between rain and darkness is partly accounted for by the presence of clouds, so that I(rain; dark | cloud) < I(rain; dark) and the multivariate mutual information I(rain; dark; cloud) is positive.
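A small numerical sketch of this common-cause situation, with illustrative (not empirical) probabilities: a cloud variable drives both rain and darkness, which are conditionally independent given the cloud, and the resulting MMI is positive.

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mmi3(p):
    """I(X;Y;Z) via the entropy expansion H(X)+H(Y)+H(Z)-H(XY)-H(XZ)-H(YZ)+H(XYZ)."""
    h = entropy
    return (h(p.sum((1, 2))) + h(p.sum((0, 2))) + h(p.sum((0, 1)))
            - h(p.sum(2)) - h(p.sum(1)) - h(p.sum(0)) + h(p))

# Axes: (cloud, rain, dark). Illustrative numbers: clouds occur half the time;
# given clouds, rain and darkness are each likely (0.8) and conditionally
# independent; given no clouds, each is unlikely (0.1).
p = np.zeros((2, 2, 2))
for cloud, p_c in ((0, 0.5), (1, 0.5)):
    p_rain = 0.8 if cloud else 0.1
    p_dark = 0.8 if cloud else 0.1
    for rain in (0, 1):
        for dark in (0, 1):
            pr = p_rain if rain else 1 - p_rain
            pd = p_dark if dark else 1 - p_dark
            p[cloud, rain, dark] = p_c * pr * pd

print(mmi3(p))  # positive: rain and darkness carry redundant information about clouds
```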
Examples of negative multivariate mutual information (synergy)
The case of negative MMI is infamously non-intuitive. A prototypical example of negative I(X;Y;Z) has X as the output of a XOR gate to which Y and Z are independent random inputs. The inputs are marginally independent, so I(Y;Z) = 0, but once the output X is known, the value of Y completely determines the value of Z, so I(Y;Z|X) = 1 bit, and therefore I(X;Y;Z) = I(Y;Z) − I(Y;Z|X) = −1 bit.

This situation is an instance where fixing the common effect X of two causes Y and Z induces a dependency among the causes that did not exist beforehand, a behavior known in the Bayesian-network literature as "explaining away". A classic example due to Pearl is automotive diagnostics: a car may fail to start because its battery is dead or because its fuel line is blocked. Ordinarily the two causes are independent, but if we learn that the car does not start and that the battery is fine, a fuel blockage becomes much more likely.

Battery death and fuel blockage are thus dependent, conditional on their common effect car starting. The obvious directionality in the common-effect graph belies a deep informational symmetry: if conditioning on a common effect increases the dependency between its two parent causes, then conditioning on one of the causes must create the same increase in dependency between the second cause and the common effect. In Pearl's automotive example, if conditioning on car starts induces some amount of dependency between the two causes battery dead and fuel blocked, then conditioning on fuel blocked must induce exactly the same amount of dependency between battery dead and car starts.
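A minimal numerical check of the XOR example (self-contained sketch; the helper names are my own): two independent fair input bits and their XOR output give I(X;Y;Z) = −1 bit.

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mmi3(p):
    """I(X;Y;Z) via the entropy expansion over all marginals of a 3-axis joint table."""
    h = entropy
    return (h(p.sum((1, 2))) + h(p.sum((0, 2))) + h(p.sum((0, 1)))
            - h(p.sum(2)) - h(p.sum(1)) - h(p.sum(0)) + h(p))

# Axes: (Y, Z, X) with Y, Z independent fair bits and X = Y XOR Z.
p = np.zeros((2, 2, 2))
for y in (0, 1):
    for z in (0, 1):
        p[y, z, y ^ z] = 0.25

print(mmi3(p))  # -1.0: knowing X together with either input pins down the other (synergy)
```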
Bounds
The bounds for the 3-variable case are

−min{ I(X;Y|Z), I(Y;Z|X), I(X;Z|Y) } ≤ I(X;Y;Z) ≤ min{ I(X;Y), I(Y;Z), I(X;Z) }.
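These bounds follow directly from the inductive definition together with the symmetry of I(X;Y;Z): for each way of singling out one of the three variables,

I(X;Y;Z) = I(X;Y) − I(X;Y|Z),

and since mutual information and conditional mutual information are both nonnegative, I(X;Y;Z) ≤ I(X;Y) and I(X;Y;Z) ≥ −I(X;Y|Z); taking the tightest of the three choices gives the stated bounds.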
Difficulties
A complication is that this multivariate mutual information (as well as the interaction information) can be positive, negative, or zero, which makes this quantity difficult to interpret intuitively. In fact, for n random variables, the information diagram contains 2^n − 1 atomic regions, of which only n correspond to the entropy of a single variable conditioned on all the others; the remaining 2^n − n − 1 atoms describe the possible interactions among two or more of the variables, and those involving three or more variables may individually be negative.