In probability theory and information theory, the variation of information or shared information distance is a measure of the distance between two clusterings (partitions of elements). It is closely related to mutual information; indeed, it is a simple linear expression involving the mutual information. Unlike the mutual information, however, the variation of information is a true metric, in that it obeys the triangle inequality.
Definition
Suppose we have two partitions $X$ and $Y$ of a set $A$ into disjoint subsets, namely $X = \{X_{1}, X_{2}, \ldots, X_{k}\}$ and $Y = \{Y_{1}, Y_{2}, \ldots, Y_{l}\}$. Let

$n = \sum_{i} |X_{i}| = \sum_{j} |Y_{j}| = |A|$,
$p_{i} = |X_{i}| / n$, $\quad q_{j} = |Y_{j}| / n$,
$r_{ij} = |X_{i} \cap Y_{j}| / n$.

Then the variation of information between the two partitions is

$\mathrm{VI}(X; Y) = -\sum_{i,j} r_{ij} \left[ \log \frac{r_{ij}}{p_{i}} + \log \frac{r_{ij}}{q_{j}} \right]$.

This is equivalent to the shared information distance between the random variables $i$ and $j$ with respect to the uniform probability measure on $A$ defined by $\mu(B) := |B| / n$ for $B \subseteq A$.
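As a concrete illustration, here is a minimal Python sketch that evaluates the definition above directly, assuming partitions are represented as lists of disjoint sets and using natural logarithms; the function name `variation_of_information` is an illustrative choice, not part of the original text:

```python
import math

def variation_of_information(X, Y):
    """Variation of information between two partitions of the same set A.

    X and Y are lists of disjoint sets whose union is A.
    Uses natural logarithms, so the result is in nats.
    """
    n = float(sum(len(x) for x in X))      # n = |A|
    vi = 0.0
    for x in X:
        p = len(x) / n                     # p_i = |X_i| / n
        for y in Y:
            q = len(y) / n                 # q_j = |Y_j| / n
            r = len(x & y) / n             # r_ij = |X_i ∩ Y_j| / n
            if r > 0.0:                    # terms with r_ij = 0 contribute nothing
                vi -= r * (math.log(r / p) + math.log(r / q))
    return vi

# Hypothetical example: two partitions of {1, ..., 6}
X = [{1, 2, 3}, {4, 5, 6}]
Y = [{1, 2}, {3, 4}, {5, 6}]
print(variation_of_information(X, Y))  # positive, since the partitions differ
print(variation_of_information(X, X))  # 0.0: every partition is at distance 0 from itself
```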
Identities
The variation of information satisfies

$\mathrm{VI}(X; Y) = H(X) + H(Y) - 2 I(X, Y)$,

where $H(X)$ is the entropy of $X$, and $I(X, Y)$ is the mutual information between $X$ and $Y$ with respect to the uniform probability measure on $A$. This can be rewritten as

$\mathrm{VI}(X; Y) = H(X, Y) - I(X, Y)$,

where $H(X, Y)$ is the joint entropy of $X$ and $Y$, or as

$\mathrm{VI}(X; Y) = H(X \mid Y) + H(Y \mid X)$,

where $H(X \mid Y)$ and $H(Y \mid X)$ are the respective conditional entropies.
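A quick numerical check of these identities, under the same illustrative conventions as the sketch above (partitions as lists of disjoint sets, natural logarithms; the helper names `entropy` and `mutual_information` are assumptions for this example):

```python
import math

def entropy(X, n):
    """Entropy H(X) of a partition under the uniform measure on A."""
    return -sum((len(x) / n) * math.log(len(x) / n) for x in X)

def mutual_information(X, Y, n):
    """Mutual information I(X, Y) between two partitions."""
    mi = 0.0
    for x in X:
        for y in Y:
            r = len(x & y) / n             # r_ij
            if r > 0.0:
                mi += r * math.log(r / ((len(x) / n) * (len(y) / n)))
    return mi

X = [{1, 2, 3}, {4, 5, 6}]
Y = [{1, 2}, {3, 4}, {5, 6}]
n = 6.0

hx, hy, mi = entropy(X, n), entropy(Y, n), mutual_information(X, Y, n)
vi = hx + hy - 2.0 * mi        # VI(X; Y) = H(X) + H(Y) - 2 I(X, Y)
hxy = hx + hy - mi             # joint entropy H(X, Y)
print(vi, hxy - mi)            # second identity gives the same value
print((hx - mi) + (hy - mi))   # H(X|Y) + H(Y|X): same value again
```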