In statistics, Tschuprow's T is a measure of association between two nominal variables, giving a value between 0 and 1 (inclusive). It is closely related to Cramér's V, coinciding with it for square contingency tables. It was published by Alexander Tschuprow (alternative spelling: Chuprov) in 1939.
For an $r \times c$ contingency table with $r$ rows and $c$ columns, let $\pi_{ij}$ be the proportion of the population in cell $(i,j)$, and let

$$\pi_{i+} = \sum_{j=1}^{c} \pi_{ij} \quad \text{and} \quad \pi_{+j} = \sum_{i=1}^{r} \pi_{ij}.$$
Then the mean square contingency is given as

$$\phi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(\pi_{ij} - \pi_{i+}\pi_{+j})^2}{\pi_{i+}\pi_{+j}},$$
and Tschuprow's T as

$$T = \sqrt{\frac{\phi^2}{\sqrt{(r-1)(c-1)}}}.$$
$T$ equals zero if and only if independence holds in the table, i.e., if and only if $\pi_{ij} = \pi_{i+}\pi_{+j}$ for all $i$ and $j$. $T$ equals one if and only if there is perfect dependence in the table, i.e., if and only if for each $i$ there is exactly one $j$ such that $\pi_{ij} > 0$, and vice versa. Hence, it can only equal 1 for square tables. In this it differs from Cramér's V, which can equal 1 for any rectangular table.
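A minimal NumPy sketch of these definitions (the function name tschuprow_t and the example tables are illustrative, not from the source):

```python
import numpy as np

def tschuprow_t(pi):
    """Tschuprow's T for an r x c table of population proportions pi."""
    pi = np.asarray(pi, dtype=float)
    r, c = pi.shape
    row = pi.sum(axis=1, keepdims=True)   # pi_{i+}
    col = pi.sum(axis=0, keepdims=True)   # pi_{+j}
    expected = row * col                  # pi_{i+} * pi_{+j}
    phi2 = ((pi - expected) ** 2 / expected).sum()  # mean square contingency
    return np.sqrt(phi2 / np.sqrt((r - 1) * (c - 1)))

# Independence: pi_{ij} = pi_{i+} * pi_{+j} everywhere, so T = 0.
independent = np.outer([0.3, 0.7], [0.4, 0.6])
print(tschuprow_t(independent))   # ~0.0

# Perfect dependence in a square table: one nonzero cell per row and column, so T = 1.
diagonal = np.diag([0.2, 0.3, 0.5])
print(tschuprow_t(diagonal))      # 1.0
```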
If we have a multinomial sample of size $n$, the usual way to estimate $T$ from the data is via the formula

$$\hat{T} = \sqrt{\frac{\displaystyle\sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(p_{ij} - p_{i+}p_{+j})^2}{p_{i+}p_{+j}}}{\sqrt{(r-1)(c-1)}}},$$
where $p_{ij} = n_{ij}/n$ is the proportion of the sample in cell $(i,j)$. This is the empirical value of $T$.
With $\chi^2$ the Pearson chi-square statistic, this formula can also be written as

$$\hat{T} = \sqrt{\frac{\chi^2 / n}{\sqrt{(r-1)(c-1)}}}.$$
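As a sketch of this chi-square form, the estimate can be computed from observed counts with SciPy's chi2_contingency (the helper name tschuprow_t_hat and the example counts are illustrative):

```python
import numpy as np
from scipy.stats import chi2_contingency

def tschuprow_t_hat(counts):
    """Empirical Tschuprow's T from an r x c table of observed counts."""
    counts = np.asarray(counts, dtype=float)
    r, c = counts.shape
    n = counts.sum()
    # Pearson chi-square statistic, without Yates' continuity correction.
    chi2, _, _, _ = chi2_contingency(counts, correction=False)
    return np.sqrt((chi2 / n) / np.sqrt((r - 1) * (c - 1)))

counts = np.array([[30, 10,  5],
                   [10, 25, 10],
                   [ 5, 10, 20]])
print(tschuprow_t_hat(counts))
```

Recent SciPy versions also expose this directly as scipy.stats.contingency.association(counts, method="tschuprow"), which should agree with the value computed above.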