Supplementary data for the paper "Why psychologists should not default to Welch’s t-test instead of Student’s t-test (and why the Anderson–Darling test is an underused alternative)"

DOI:10.4121/e8e6861a-7ab0-4b6d-bd67-5f95029322c5.v3

Datacite citation style

de Winter, Joost (2025): Supplementary data for the paper "Why psychologists should not default to Welch’s t-test instead of Student’s t-test (and why the Anderson–Darling test is an underused alternative)". Version 3. 4TU.ResearchData. dataset. https://doi.org/10.4121/e8e6861a-7ab0-4b6d-bd67-5f95029322c5.v3

Dataset

Version 4 - 2025-06-02 (latest)

Version 3 - 2025-06-01

Version 2 - 2025-05-12

Version 1 - 2025-04-28

Keywords

applied statistics; simulations; Student t-test; Welch's test

Licence

CC BY 4.0

by Joost de Winter

This paper evaluates the claim that Welch’s t-test (WT) should replace Student’s independent-samples t-test (IT) as the default approach for comparing sample means. Simulations were performed with equal and unequal variances, skewed distributions, and different sample sizes. For normal distributions, we confirm that the WT maintains the false positive rate close to the nominal level of 0.05 when sample sizes and standard deviations are unequal. However, the WT yields inflated false positive rates under skewed distributions, even with relatively large sample sizes, whereas the IT avoids such inflation. A complementary empirical study based on gender differences in two psychological scales corroborates these findings. Finally, we contend that the null hypothesis of unequal variances together with equal means lacks plausibility, and that empirically, a difference in means typically coincides with differences in variance and skewness. An additional analysis using the Kolmogorov-Smirnov and Anderson-Darling tests demonstrates that examining entire distributions, rather than just their means, can provide a more suitable alternative when facing unequal variances or skewed distributions. Given these results, researchers should remain cautious with software defaults, such as R’s t.test function, which applies Welch’s test unless equal variances are requested.
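The kind of simulation the abstract describes can be sketched as follows. This is an illustrative Python sketch, not the MATLAB code deposited with this dataset: the function names (`student_t`, `welch_t`, `t_two_sided_p`, `false_positive_rates`) are hypothetical, the sample sizes and standard deviations in the usage lines are assumed for illustration and are not taken from the paper, and the p-value is obtained by simple numerical integration of the t density so that only the standard library is needed. Both groups are drawn from normal distributions with equal means, so every rejection is a false positive.

```python
import math
import random
import statistics


def t_two_sided_p(t, df, steps=400):
    """Two-sided p-value for a t statistic (Simpson's rule on the t density)."""
    x = abs(t)
    if x == 0.0:
        return 1.0
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(u):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))

    h = x / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    area = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * area)


def student_t(a, b):
    """Student's t statistic and degrees of freedom (pooled variance)."""
    n1, n2 = len(a), len(b)
    v1, v2 = statistics.variance(a), statistics.variance(b)
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (statistics.fmean(a) - statistics.fmean(b)) / se, n1 + n2 - 2


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(a), len(b)
    q1 = statistics.variance(a) / n1
    q2 = statistics.variance(b) / n2
    df = (q1 + q2) ** 2 / (q1 ** 2 / (n1 - 1) + q2 ** 2 / (n2 - 1))
    return (statistics.fmean(a) - statistics.fmean(b)) / math.sqrt(q1 + q2), df


def false_positive_rates(n1, n2, sd1, sd2, reps=2000, alpha=0.05, seed=1):
    """Share of rejections when both groups are normal with equal means."""
    rng = random.Random(seed)
    hits_student = hits_welch = 0
    for _ in range(reps):
        a = [rng.gauss(0.0, sd1) for _ in range(n1)]
        b = [rng.gauss(0.0, sd2) for _ in range(n2)]
        hits_student += t_two_sided_p(*student_t(a, b)) < alpha
        hits_welch += t_two_sided_p(*welch_t(a, b)) < alpha
    return hits_student / reps, hits_welch / reps


if __name__ == "__main__":
    # Assumed illustrative setting: the smaller group has the larger SD.
    student_rate, welch_rate = false_positive_rates(n1=10, n2=40, sd1=2.0, sd2=1.0)
    print(f"Student: {student_rate:.3f}  Welch: {welch_rate:.3f}")
```

Replacing `rng.gauss` with a skewed generator such as `rng.lognormvariate`, with parameters chosen so that both groups share the same population mean, would extend the sketch toward the skewed-distribution conditions mentioned above.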

History

2025-04-28 first online

2025-06-01 published, posted

Publisher

4TU.ResearchData

Format

script/m-file; data/xlsx

Organizations

TU Delft, Faculty of Mechanical Engineering, Department of Cognitive Robotics

DATA

Files (1)

MATLAB data and code v3.zip (85,496,032 bytes; MD5: 6dbaa6acd893ea3059216d017201164f)