Welch's t-test

In statistics, Welch's t-test, or unequal variances t-test, is a two-sample location test used to test the hypothesis that two populations have equal means. It is an adaptation of Student's t-test and is more reliable when the two samples have unequal variances and unequal sample sizes. These tests are often referred to as "unpaired" or "independent samples" t-tests, as they are typically applied when the statistical units underlying the two samples being compared are non-overlapping. Because Welch's t-test has been less popular than Student's t-test and may be less familiar to readers, a more informative name is "Welch's unequal variances t-test", or "unequal variances t-test" for brevity.

Assumptions

Student's t-test assumes that the two populations have normal distributions with equal variances. Welch's t-test is designed for unequal variances, but the assumption of normality is maintained. Welch's t-test is an approximate solution to the Behrens–Fisher problem.

Calculations

Welch's t-test defines the statistic t by the following formula:

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}}}$$

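where $\bar{X}_1$, $s_1^2$ and $N_1$ are the first sample mean, sample variance and sample size, respectively, and $\bar{X}_2$, $s_2^2$ and $N_2$ are the corresponding quantities for the second sample. Unlike in Student's t-test, the denominator is not based on a pooled variance estimate.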
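The statistic above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation; the function name `welch_t` and the toy samples are made up for this example.

```python
import math
from statistics import mean, variance


def welch_t(sample1, sample2):
    """Welch's t statistic for two independent samples."""
    n1, n2 = len(sample1), len(sample2)
    # statistics.variance is the unbiased sample variance (divides by n - 1).
    # Each variance is divided by its own sample size; nothing is pooled.
    standard_error = math.sqrt(variance(sample1) / n1 + variance(sample2) / n2)
    return (mean(sample1) - mean(sample2)) / standard_error


# Toy data (hypothetical), chosen only so the arithmetic is easy to follow.
t = welch_t([1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 4.0, 6.0, 8.0, 10.0, 12.0])
print(t)
```

For the toy data the sample means are 3 and 7 and the sample variances are 2.5 and 14, so the statistic is $-4 / \sqrt{2.5/5 + 14/6}$.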

The degrees of freedom ν associated with this variance estimate is approximated using the Welch–Satterthwaite equation:

$$\nu \approx \frac{\left(\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}\right)^2}{\dfrac{s_1^4}{N_1^2 \nu_1} + \dfrac{s_2^4}{N_2^2 \nu_2}}$$

Here $\nu_1 = N_1 - 1$ is the degrees of freedom associated with the first variance estimate, and $\nu_2 = N_2 - 1$ is the degrees of freedom associated with the second variance estimate.
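A small Python sketch of the Welch–Satterthwaite approximation follows. The function name `welch_satterthwaite_df` and the input values are illustrative assumptions; the function takes the two sample variances and sample sizes directly.

```python
def welch_satterthwaite_df(var1, n1, var2, n2):
    """Approximate degrees of freedom for Welch's t-test.

    var1, var2 are sample variances; n1, n2 are sample sizes (each >= 2).
    """
    a = var1 / n1  # s_1^2 / N_1
    b = var2 / n2  # s_2^2 / N_2
    # a**2 equals s_1^4 / N_1^2, so the denominator matches the formula above.
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


# Hypothetical inputs: variances 2.5 and 14 from samples of size 5 and 6.
nu = welch_satterthwaite_df(2.5, 5, 14.0, 6)
print(nu)

# With equal variances and equal sizes the result is N_1 + N_2 - 2.
print(welch_satterthwaite_df(4.0, 15, 4.0, 15))
```

The result generally lies between the smaller of $\nu_1$ and $\nu_2$ and the pooled value $N_1 + N_2 - 2$, and it is usually not an integer.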

Welch's t-test can also be calculated for ranked data and might then be named Welch's U-test.
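The text does not say how the ranks are formed. One common choice, assumed here, is to rank the pooled observations and give tied values the average of the ranks they span (midranks); the two lists of ranks are then used in place of the raw samples in the formulas for t and ν. The helper name `pooled_midranks` is hypothetical.

```python
def pooled_midranks(sample1, sample2):
    """Rank the pooled data (1-based), averaging ranks within ties.

    Returns the ranks split back into the two original samples.
    """
    pooled = list(sample1) + list(sample2)
    order = sorted(range(len(pooled)), key=lambda idx: pooled[idx])
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(order):
        # Extend j to the end of the run of values tied with position i.
        j = i
        while j + 1 < len(order) and pooled[order[j + 1]] == pooled[order[i]]:
            j += 1
        midrank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = midrank
        i = j + 1
    return ranks[:len(sample1)], ranks[len(sample1):]


# The tied value 20 occupies positions 2 and 3, so both copies get rank 2.5.
ranks1, ranks2 = pooled_midranks([10, 20], [20, 30])
print(ranks1, ranks2)
```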

Statistical test

Once t and ν have been computed, these statistics can be used with the t-distribution to test the null hypothesis that the two population means are equal (using a two-tailed test), or the null hypothesis that one of the population means is greater than or equal to the other (using a one-tailed test). The approximate degrees of freedom is rounded down to the nearest integer.
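To make this step concrete, the sketch below computes a two-tailed p-value from t and ν using only the Python standard library. It integrates the Student t density numerically with Simpson's rule, which is a choice made for this illustration; statistical packages typically evaluate the distribution function directly. The names `t_pdf` and `two_tailed_p` are hypothetical. The function accepts ν as given, so a caller who wants the rounding described above can pass `math.floor(nu)`.

```python
import math


def t_pdf(x, nu):
    """Density of Student's t-distribution with nu degrees of freedom."""
    log_const = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
                 - 0.5 * math.log(nu * math.pi))
    return math.exp(log_const - (nu + 1) / 2 * math.log1p(x * x / nu))


def two_tailed_p(t, nu, steps=2000):
    """P(|T| >= |t|) for T ~ t(nu); steps must be even (Simpson's rule)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, nu) + t_pdf(upper, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    central_mass = total * h / 3  # integral of the density over [0, |t|]
    return max(0.0, 1.0 - 2.0 * central_mass)


# Sanity check against a closed form: for nu = 1 (the Cauchy distribution),
# P(|T| >= 1) is exactly 1/2.
print(two_tailed_p(1.0, 1.0))
```

For a one-tailed test in the direction of the observed difference, the corresponding p-value is half of the two-tailed value.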

Advantages and limitations

Welch's t-test is more robust than Student's t-test and maintains type I error rates close to nominal for unequal variances and for unequal sample sizes. Furthermore, the power of Welch's t-test comes close to that of Student's t-test, even when the population variances are equal and sample sizes are balanced. Welch's t-test can be generalized to more than two samples, and this generalization is more robust than one-way analysis of variance (ANOVA).

It is not recommended to pre-test for equal variances and then choose between Student's t-test and Welch's t-test. Rather, Welch's t-test can be applied directly and without any substantial disadvantage compared with Student's t-test, as noted above. Welch's t-test remains robust for skewed distributions when sample sizes are large. Reliability decreases for skewed distributions and smaller samples, where one could possibly perform Welch's t-test on ranked data.

Examples

The following three examples compare Welch's t-test and Student's t-test. The samples were drawn at random from normal distributions using the R programming language.

For all three examples, the population means were $\mu_1 = 20$ and $\mu_2 = 22$.

The first example is for equal variances ($\sigma_1^2 = \sigma_2^2 = 4$) and equal sample sizes ($N_1 = N_2 = 15$). Let $A_1$ and $A_2$ denote two random samples:

$$A_1 = \{27.5, 21.0, 19.0, 23.6, 17.0, 17.9, 16.9, 20.1, 21.9, 22.6, 23.1, 19.6, 19.0, 21.7, 21.4\}$$

$$A_2 = \{27.1, 22.0, 20.8, 23.4, 23.4, 23.5, 25.8, 22.0, 24.8, 20.2, 21.9, 22.1, 22.9, 20.5, 24.4\}$$

The second example is for unequal variances ($\sigma_1^2 = 16$, $\sigma_2^2 = 1$) and unequal sample sizes ($N_1 = 10$, $N_2 = 20$). The smaller sample has the larger variance:

$$A_1 = \{17.2, 20.9, 22.6, 18.1, 21.7, 21.4, 23.5, 24.2, 14.7, 21.8\}$$

$$A_2 = \{21.5, 22.8, 21.0, 23.0, 21.6, 23.6, 22.5, 20.7, 23.4, 21.8, 20.7, 21.7, 21.5, 22.5, 23.6, 21.5, 22.5, 23.5, 21.5, 21.8\}$$

The third example is for unequal variances ($\sigma_1^2 = 1$, $\sigma_2^2 = 16$) and unequal sample sizes ($N_1 = 10$, $N_2 = 20$). The larger sample has the larger variance:

$$A_1 = \{19.8, 20.4, 19.6, 17.8, 18.5, 18.9, 18.3, 18.9, 19.5, 22.0\}$$

$$A_2 = \{28.2, 26.6, 20.1, 23.3, 25.2, 22.1, 17.7, 27.6, 20.6, 13.7, 23.2, 17.5, 20.6, 18.0, 23.9, 21.6, 24.3, 20.4, 24.0, 13.2\}$$
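The three pairs of samples listed above can be run through both tests with a short script. The sketch below is a plain standard-library Python illustration (the original samples were generated in R); the helper names `welch` and `student` are hypothetical, and the script only prints the t statistics and degrees of freedom it computes.

```python
import math
from statistics import mean, variance


def welch(s1, s2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(s1), len(s2)
    a, b = variance(s1) / n1, variance(s2) / n2
    t = (mean(s1) - mean(s2)) / math.sqrt(a + b)
    nu = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, nu


def student(s1, s2):
    """Student's t statistic (pooled variance) and its degrees of freedom."""
    n1, n2 = len(s1), len(s2)
    pooled = ((n1 - 1) * variance(s1) + (n2 - 1) * variance(s2)) / (n1 + n2 - 2)
    t = (mean(s1) - mean(s2)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, n1 + n2 - 2


examples = {
    1: ([27.5, 21.0, 19.0, 23.6, 17.0, 17.9, 16.9, 20.1, 21.9, 22.6,
         23.1, 19.6, 19.0, 21.7, 21.4],
        [27.1, 22.0, 20.8, 23.4, 23.4, 23.5, 25.8, 22.0, 24.8, 20.2,
         21.9, 22.1, 22.9, 20.5, 24.4]),
    2: ([17.2, 20.9, 22.6, 18.1, 21.7, 21.4, 23.5, 24.2, 14.7, 21.8],
        [21.5, 22.8, 21.0, 23.0, 21.6, 23.6, 22.5, 20.7, 23.4, 21.8,
         20.7, 21.7, 21.5, 22.5, 23.6, 21.5, 22.5, 23.5, 21.5, 21.8]),
    3: ([19.8, 20.4, 19.6, 17.8, 18.5, 18.9, 18.3, 18.9, 19.5, 22.0],
        [28.2, 26.6, 20.1, 23.3, 25.2, 22.1, 17.7, 27.6, 20.6, 13.7,
         23.2, 17.5, 20.6, 18.0, 23.9, 21.6, 24.3, 20.4, 24.0, 13.2]),
}

results = {}
for label, (a1, a2) in examples.items():
    results[label] = {"student": student(a1, a2), "welch": welch(a1, a2)}
    (ts, dfs), (tw, dfw) = results[label]["student"], results[label]["welch"]
    print(f"Example {label}: Student t={ts:.2f} (df={dfs}), "
          f"Welch t={tw:.2f} (df={dfw:.1f})")
```

With equal sample sizes the pooled and unpooled standard errors coincide, so the two t statistics in Example 1 are identical and only the degrees of freedom differ.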

Reference p-values were obtained by simulating the distributions of the t statistics for the null hypothesis of equal population means ($\mu_1 - \mu_2 = 0$). Results are summarised in the table below, with two-tailed p-values:

Welch's t-test and Student's t-test gave practically identical results for the two samples with equal variances and equal sample sizes (Example 1). For unequal variances, Student's t-test gave a low p-value when the smaller sample had a larger variance (Example 2) and a high p-value when the larger sample had a larger variance (Example 3). For unequal variances, Welch's t-test gave p-values close to simulated p-values.
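The article does not describe how the reference simulation was carried out. One plausible scheme, sketched here purely as an assumption, is to draw many pairs of normal samples with equal means and the stated population standard deviations, compute the t statistic for each pair, and report the fraction of simulated statistics at least as extreme as the observed one. The function names, the choice of the Welch statistic, the number of replications and the seed are all illustrative.

```python
import math
import random


def welch_t(s1, s2):
    """Welch's t statistic, written out by hand to keep the loop fast."""
    n1, n2 = len(s1), len(s2)
    m1, m2 = sum(s1) / n1, sum(s2) / n2
    v1 = sum((x - m1) ** 2 for x in s1) / (n1 - 1)
    v2 = sum((x - m2) ** 2 for x in s2) / (n2 - 1)
    return (m1 - m2) / math.sqrt(v1 / n1 + v2 / n2)


def simulated_two_tailed_p(t_obs, n1, sd1, n2, sd2, reps=5000, seed=0):
    """Monte Carlo p-value under the null hypothesis of equal means.

    sd1 and sd2 are the assumed population standard deviations.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    hits = 0
    for _ in range(reps):
        s1 = [rng.gauss(0.0, sd1) for _ in range(n1)]
        s2 = [rng.gauss(0.0, sd2) for _ in range(n2)]
        if abs(welch_t(s1, s2)) >= abs(t_obs):
            hits += 1
    return hits / reps


# Setting of the second example: N1 = 10 with variance 16, N2 = 20 with
# variance 1. The observed statistic 2.0 is a placeholder, not a value
# taken from the article.
p_sim = simulated_two_tailed_p(2.0, 10, 4.0, 20, 1.0)
print(p_sim)
```

The precision of such an estimate depends on the number of replications, so more replications give a more stable reference value.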

References

Welch's t-test Wikipedia

(Text) CC BY-SA