In statistics, Welch's t-test, or unequal variances t-test, is a two-sample location test used to test the hypothesis that two populations have equal means. Welch's t-test is an adaptation of Student's t-test and is more reliable when the two samples have unequal variances and unequal sample sizes. These tests are often referred to as "unpaired" or "independent samples" t-tests, as they are typically applied when the statistical units underlying the two samples being compared are non-overlapping. Because Welch's t-test has been less popular than Student's t-test and may be less familiar to readers, a more informative name is "Welch's unequal variances t-test", or "unequal variances t-test" for brevity.
Assumptions
Student's t-test assumes that the two populations are normally distributed with equal variances. Welch's t-test is designed for unequal variances, but it retains the assumption of normality. Welch's t-test is an approximate solution to the Behrens–Fisher problem.
Calculations
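Welch's t-test defines the statistic t by the following formula:
$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}}}$$
where $\bar{X}_i$, $s_i$ and $N_i$ are the mean, the corrected sample standard deviation and the size of the $i$-th sample. Unlike in Student's t-test, the denominator is not based on a pooled variance estimate.
The degrees of freedom $\nu$ associated with this variance estimate are approximated using the Welch–Satterthwaite equation:
$$\nu \approx \frac{\left(\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}\right)^2}{\dfrac{s_1^4}{N_1^2 \nu_1} + \dfrac{s_2^4}{N_2^2 \nu_2}}$$
Here $\nu_i = N_i - 1$ is the degrees of freedom associated with the $i$-th variance estimate.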
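As an illustration of the calculation, the following is a minimal Python sketch that computes the statistic t (difference of sample means divided by the square root of the summed squared standard errors) and the Welch–Satterthwaite degrees of freedom. The function name `welch_t` and its arguments are hypothetical and chosen only for this example; it assumes each sample holds at least two values and has non-zero variance.

```python
import math
import statistics


def welch_t(sample1, sample2):
    """Return (t, nu) for two independent samples.

    Hypothetical helper for illustration; assumes each sample has at
    least two values and a non-zero variance.
    """
    n1, n2 = len(sample1), len(sample2)
    mean1 = statistics.fmean(sample1)
    mean2 = statistics.fmean(sample2)
    # statistics.variance uses the n - 1 denominator (corrected sample variance).
    se1_sq = statistics.variance(sample1) / n1  # squared standard error, sample 1
    se2_sq = statistics.variance(sample2) / n2  # squared standard error, sample 2

    # Welch's statistic: the denominator is NOT a pooled variance estimate.
    t = (mean1 - mean2) / math.sqrt(se1_sq + se2_sq)

    # Welch-Satterthwaite approximation of the degrees of freedom.
    nu = (se1_sq + se2_sq) ** 2 / (
        se1_sq ** 2 / (n1 - 1) + se2_sq ** 2 / (n2 - 1)
    )
    return t, nu


if __name__ == "__main__":
    t, nu = welch_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10, 12])
    print(f"t = {t:.4f}, nu = {nu:.4f}")
```

Note that the resulting degrees of freedom are generally not an integer; the sketch returns them unrounded.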
Welch's t-test can also be calculated for ranked data and might then be named Welch's U-test.
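One way to read "calculated for ranked data" is sketched below in Python. The sketch assumes that the two samples are pooled, that each value is replaced by its rank in the pooled sample (ties receiving the average of the ranks they span), and that the usual Welch statistic is then computed on the ranks. Both this ranking convention and the helper names `pooled_midranks` and `welch_t_on_ranks` are assumptions made for illustration, not a definitive specification of the rank-based variant.

```python
import math
import statistics


def pooled_midranks(sample1, sample2):
    """Rank the pooled values (1 = smallest), averaging ranks over ties."""
    pooled = sorted(list(sample1) + list(sample2))
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        # 0-based positions i..j-1 carry ranks i+1..j; store their mean.
        rank_of[pooled[i]] = (i + 1 + j) / 2
        i = j
    return [rank_of[v] for v in sample1], [rank_of[v] for v in sample2]


def welch_t_on_ranks(sample1, sample2):
    """Welch statistic and degrees of freedom computed on pooled ranks.

    Assumes each sample has at least two values and that the ranks
    within each sample are not all identical.
    """
    r1, r2 = pooled_midranks(sample1, sample2)
    se1_sq = statistics.variance(r1) / len(r1)
    se2_sq = statistics.variance(r2) / len(r2)
    t = (statistics.fmean(r1) - statistics.fmean(r2)) / math.sqrt(se1_sq + se2_sq)
    nu = (se1_sq + se2_sq) ** 2 / (
        se1_sq ** 2 / (len(r1) - 1) + se2_sq ** 2 / (len(r2) - 1)
    )
    return t, nu
```

Because only the ordering of the values enters the calculation, the result is unchanged by any strictly increasing transformation of the data, such as taking logarithms.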
Statistical test
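Once t and the degrees of freedom $\nu$ have been computed, these statistics can be used with the t-distribution to test one of two possible null hypotheses: that the two population means are equal, in which case a two-tailed test is applied; or that one of the population means is greater than or equal to the other, in which case a one-tailed test is applied.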
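To make the testing step concrete, the following Python sketch converts a t statistic and a (possibly non-integer) number of degrees of freedom into a p-value using only the standard library. It relies on the identity that the two-tailed p-value equals the regularized incomplete beta function evaluated at ν/(ν + t²) with parameters ν/2 and 1/2, and it evaluates that function with a standard continued-fraction expansion. The names `welch_p_value`, `reg_inc_beta` and the `alternative` argument are hypothetical; statistical software would normally supply the t-distribution directly.

```python
import math


def _beta_cf(a, b, x, max_iter=500, eps=1e-12):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b) for 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    # Use the expansion that converges faster for this x.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def welch_p_value(t, nu, alternative="two-sided"):
    """p-value of t under a t-distribution with nu degrees of freedom.

    alternative: "two-sided", "greater" (P(T >= t)) or "less" (P(T <= t)).
    """
    two_sided = reg_inc_beta(nu / 2.0, 0.5, nu / (nu + t * t))
    if alternative == "two-sided":
        return two_sided
    upper = two_sided / 2.0 if t > 0 else 1.0 - two_sided / 2.0
    if alternative == "greater":
        return upper
    if alternative == "less":
        return 1.0 - upper
    raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
```

For instance, `welch_p_value(2.228, 10)` is close to 0.05, in line with the tabulated two-tailed critical value for 10 degrees of freedom. The sketch keeps ν as a real number; rounding it down to an integer, as some tools do, would give a slightly larger p-value.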
Advantages and limitations
Welch's t-test is more robust than Student's t-test and maintains type I error rates close to nominal for unequal variances and for unequal sample sizes. Furthermore, the power of Welch's t-test comes close to that of Student's t-test, even when the population variances are equal and sample sizes are balanced. Welch's t-test can be generalized to more than two samples, and this generalization is more robust than one-way analysis of variance (ANOVA).
It is not recommended to pre-test for equal variances and then choose between Student's t-test and Welch's t-test. Rather, Welch's t-test can be applied directly, without any substantial disadvantage compared with Student's t-test, as noted above. Welch's t-test remains robust for skewed distributions when sample sizes are large. Reliability decreases for skewed distributions and smaller samples, in which case Welch's t-test could be performed on ranked data.
Examples
The following three examples compare Welch's t-test and Student's t-test. Samples were drawn at random from normal distributions using the R programming language.
For all three examples, the population means were
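The first example is for equal variances and equal sample sizes.
The second example is for unequal variances and unequal sample sizes, where the smaller sample has the larger variance.
The third example is for unequal variances and unequal sample sizes, where the larger sample has the larger variance.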
Reference p-values were obtained by simulating the distributions of the t statistics for the null hypothesis of equal population means ($\mu_1 - \mu_2 = 0$).
Welch's t-test and Student's t-test gave practically identical results for the two samples with equal variances and equal sample sizes (Example 1). For unequal variances, Student's t-test gave a low p-value when the smaller sample had a larger variance (Example 2) and a high p-value when the larger sample had a larger variance (Example 3). For unequal variances, Welch's t-test gave p-values close to simulated p-values.
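The idea of a simulated reference p-value can be sketched as follows. This Python sketch is not the script behind the examples (which used R); it only illustrates the approach under stated assumptions: samples are drawn from normal populations with a common mean and the given standard deviations and sample sizes, the Welch statistic is computed for each simulated pair, and the two-sided p-value is taken as the share of simulated statistics at least as extreme as the observed one. The function names, the default number of replications and all argument values are hypothetical.

```python
import math
import random
import statistics


def welch_statistic(x, y):
    """Welch's t statistic for two samples (no pooled variance)."""
    se_sq = statistics.variance(x) / len(x) + statistics.variance(y) / len(y)
    return (statistics.fmean(x) - statistics.fmean(y)) / math.sqrt(se_sq)


def simulated_p_value(t_obs, n1, sd1, n2, sd2, reps=20000, seed=0):
    """Two-sided reference p-value obtained by simulating the null distribution.

    Returns the fraction of simulated |t| values that are >= |t_obs|.
    """
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    hits = 0
    for _ in range(reps):
        # Under the null hypothesis both populations share the same mean (0 here).
        x = [rng.gauss(0.0, sd1) for _ in range(n1)]
        y = [rng.gauss(0.0, sd2) for _ in range(n2)]
        if abs(welch_statistic(x, y)) >= abs(t_obs):
            hits += 1
    return hits / reps


if __name__ == "__main__":
    # Hypothetical setting: the smaller sample has the larger variance.
    print(simulated_p_value(t_obs=2.0, n1=10, sd1=4.0, n2=20, sd2=1.0))
```

The accuracy of such a reference value is limited by the number of replications, so it is itself an estimate; swapping `welch_statistic` for a pooled-variance statistic would give the corresponding reference for Student's t-test.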