Suvarna Garge (Editor)

Tversky index


The Tversky index, named after Amos Tversky, is an asymmetric similarity measure on sets that compares a variant to a prototype. The Tversky index can be seen as a generalization of Dice's coefficient and the Tanimoto coefficient.

For sets X and Y the Tversky index is a number between 0 and 1 given by

$$S(X, Y) = \frac{|X \cap Y|}{|X \cap Y| + \alpha |X \setminus Y| + \beta |Y \setminus X|}.$$

Here, $X \setminus Y$ denotes the relative complement of Y in X.

Further, α, β ≥ 0 are parameters of the Tversky index. Setting α = β = 1 produces the Tanimoto coefficient; setting α = β = 0.5 produces Dice's coefficient.
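
As a rough illustration of the definition, the following Python sketch computes the index for two finite sets and checks the two special cases numerically. The function name `tversky_index` and the example sets are hypothetical, and returning 0.0 when the denominator vanishes (for instance, when both sets are empty) is an assumed convention rather than part of the definition.

```python
def tversky_index(x, y, alpha, beta):
    """Tversky index of two finite sets for weights alpha, beta >= 0."""
    if alpha < 0 or beta < 0:
        raise ValueError("alpha and beta must be non-negative")
    x, y = set(x), set(y)
    common = len(x & y)   # |X intersect Y|
    only_x = len(x - y)   # |X \ Y|, relative complement of Y in X
    only_y = len(y - x)   # |Y \ X|, relative complement of X in Y
    denominator = common + alpha * only_x + beta * only_y
    if denominator == 0:
        # Assumed convention for the undefined 0/0 case.
        return 0.0
    return common / denominator


# Hypothetical example sets: 2 shared elements, 2 only in X, 1 only in Y.
X = {"a", "b", "c", "d"}
Y = {"c", "d", "e"}

tanimoto = tversky_index(X, Y, alpha=1.0, beta=1.0)  # equals |X & Y| / |X | Y|
dice = tversky_index(X, Y, alpha=0.5, beta=0.5)      # equals 2|X & Y| / (|X| + |Y|)

print(tanimoto, len(X & Y) / len(X | Y))
print(dice, 2 * len(X & Y) / (len(X) + len(Y)))
```

Each printed pair should agree up to floating-point rounding, since the Tversky denominator reduces to |X ∪ Y| for α = β = 1 and to (|X| + |Y|) / 2 for α = β = 0.5.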

If we consider X to be the prototype and Y to be the variant, then α corresponds to the weight of the prototype and β corresponds to the weight of the variant. Tversky measures with α + β = 1 are of special interest.
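
To make the role of the two weights concrete, the sketch below evaluates the index in both argument orders for a prototype and a variant that is a strict subset of it. The weights α = 0.9 and β = 0.1 are arbitrary illustrative values satisfying α + β = 1, the function name is hypothetical, and the zero-denominator fallback is an assumed convention.

```python
def tversky_index(x, y, alpha, beta):
    """Tversky index; x plays the prototype role, y the variant role."""
    x, y = set(x), set(y)
    common = len(x & y)
    denominator = common + alpha * len(x - y) + beta * len(y - x)
    # Assumed convention: return 0.0 when the ratio is undefined (0/0).
    return common / denominator if denominator else 0.0


prototype = {1, 2, 3, 4, 5, 6}
variant = {1, 2, 3}  # every element of the variant is in the prototype

# Elements found only in the first argument are weighted by alpha,
# elements found only in the second argument by beta.
forward = tversky_index(prototype, variant, alpha=0.9, beta=0.1)
backward = tversky_index(variant, prototype, alpha=0.9, beta=0.1)

# With equal weights the argument order no longer matters.
equal_forward = tversky_index(prototype, variant, alpha=0.5, beta=0.5)
equal_backward = tversky_index(variant, prototype, alpha=0.5, beta=0.5)

print(forward, backward)
print(equal_forward, equal_backward)
```

Here the three elements missing from the variant are penalized heavily when the prototype comes first and only lightly when the arguments are swapped, so the two scores differ whenever α ≠ β.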

Because of the inherent asymmetry, the Tversky index does not meet the criteria for a similarity metric. However, if symmetry is needed, a variant of the original formulation has been proposed using max and min functions:

$$S(X, Y) = \frac{|X \cap Y|}{|X \cap Y| + \beta \left( \alpha a + (1 - \alpha) b \right)},$$

$$a = \min\left( |X \setminus Y|, |Y \setminus X| \right),$$

$$b = \max\left( |X \setminus Y|, |Y \setminus X| \right).$$

This formulation also re-arranges parameters α and β. Thus, α controls the balance between $|X \setminus Y|$ and $|Y \setminus X|$ in the denominator. Similarly, β controls the effect of the symmetric difference $|X \triangle Y|$ versus $|X \cap Y|$ in the denominator.
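
The symmetric variant can be sketched in Python along the same lines. The function name `symmetric_tversky_index` and the example sets are hypothetical, and the 0.0 returned for a vanishing denominator is an assumed convention. Because a + b equals the size of the symmetric difference, choosing α = 0.5 makes the bracketed term half the symmetric difference, so β = 1 then gives Dice's coefficient and β = 2 gives the Tanimoto coefficient; the example checks this numerically.

```python
def symmetric_tversky_index(x, y, alpha, beta):
    """Symmetric Tversky variant built from min/max of the set differences."""
    x, y = set(x), set(y)
    common = len(x & y)
    only_x = len(x - y)
    only_y = len(y - x)
    a = min(only_x, only_y)  # smaller of the two relative complements
    b = max(only_x, only_y)  # larger of the two relative complements
    denominator = common + beta * (alpha * a + (1 - alpha) * b)
    # Assumed convention: return 0.0 when the ratio is undefined (0/0).
    return common / denominator if denominator else 0.0


# Hypothetical example sets: 2 shared elements, 2 only in X, 1 only in Y.
X = {"a", "b", "c", "d"}
Y = {"c", "d", "e"}

# Swapping the arguments leaves a and b unchanged, hence the same score.
s_xy = symmetric_tversky_index(X, Y, alpha=0.3, beta=1.5)
s_yx = symmetric_tversky_index(Y, X, alpha=0.3, beta=1.5)

dice_like = symmetric_tversky_index(X, Y, alpha=0.5, beta=1.0)
tanimoto_like = symmetric_tversky_index(X, Y, alpha=0.5, beta=2.0)

print(s_xy, s_yx)
print(dice_like, tanimoto_like)
```

Taking the min and max of the two difference sizes, rather than the sizes in argument order, is what removes the dependence on which set is treated as the prototype.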
