Supplementary data for the paper "Why psychologists should not default to Welch’s t-test instead of Student’s t-test (and why the Anderson–Darling test is an underused alternative)"

DOI:10.4121/e8e6861a-7ab0-4b6d-bd67-5f95029322c5.v1

Datacite citation style

de Winter, Joost (2025): Supplementary data for the paper "Why psychologists should not default to Welch’s t-test instead of Student’s t-test (and why the Anderson–Darling test is an underused alternative)". Version 1. 4TU.ResearchData. dataset. https://doi.org/10.4121/e8e6861a-7ab0-4b6d-bd67-5f95029322c5.v1

Dataset

Version 2 - 2025-05-12 (latest)

Version 1 - 2025-04-28

Keywords

applied statistics, simulations, Student t-test, Welch's test

Licence

CC BY 4.0

by Joost de Winter

This paper evaluates the claim that Welch’s t-test (WT) should replace Student’s independent-samples t-test (IT) as the default approach for comparing sample means. Through simulations with equal and unequal variances, skewed distributions, and varying sample sizes, we confirm that the WT effectively controls false positives when the smaller sample is drawn from the population with the larger standard deviation. However, the WT yields inflated false positive rates under skewed distributions, even for relatively large sample sizes. By contrast, the IT exhibits higher statistical power when standard deviations are equal and avoids the inflation of false positives under skew. A complementary empirical study based on gender differences in two psychological scales corroborates these findings. Furthermore, an additional analysis using the Kolmogorov–Smirnov and Anderson–Darling tests demonstrates that examining entire distributions rather than just their means can provide a more robust alternative when facing unequal variances or skewed data. Given these results, insisting on the WT as a universal default appears unwarranted. Researchers should remain cautious with software defaults, such as R defaulting to Welch’s test.
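For readers unfamiliar with how the two tests differ, the following is a minimal sketch of the two test statistics and their degrees of freedom, using only the Python standard library. It is not taken from the MATLAB code in this dataset; the function names `student_t` and `welch_t` and the sample values are illustrative assumptions. Student's test pools the two sample variances, whereas Welch's test keeps them separate and uses the Welch–Satterthwaite approximation for the degrees of freedom.

```python
import math
from statistics import mean, variance


def student_t(x, y):
    """Student's t statistic and degrees of freedom (pooled variance)."""
    n1, n2 = len(x), len(y)
    # Pooled variance: each sample variance weighted by its degrees of freedom.
    pooled_var = ((n1 - 1) * variance(x) + (n2 - 1) * variance(y)) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(x) - mean(y)) / se, n1 + n2 - 2


def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(x), len(y)
    # Per-sample squared standard errors; the variances are not pooled.
    a, b = variance(x) / n1, variance(y) / n2
    se = math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return (mean(x) - mean(y)) / se, df


# Illustrative (made-up) data: the smaller sample has the larger spread.
small_wide = [2.0, 9.0, -4.0, 12.0, 1.0]
large_narrow = [5.0, 5.5, 4.5, 5.2, 4.8, 5.1, 4.9, 5.3, 4.7, 5.0]

t_student, df_student = student_t(small_wide, large_narrow)
t_welch, df_welch = welch_t(small_wide, large_narrow)
print(f"Student: t = {t_student:.3f}, df = {df_student}")
print(f"Welch:   t = {t_welch:.3f}, df = {df_welch:.2f}")
```

In this configuration the pooled variance is dominated by the larger, low-variance sample, so Student's statistic is larger in magnitude than Welch's, and Welch's degrees of freedom fall well below n1 + n2 - 2. With equal sample sizes the two statistics coincide and only the degrees of freedom differ.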

History

2025-04-28 first online, published, posted

Publisher

4TU.ResearchData

Format

script/m-file; data/xlsx

Organizations

TU Delft, Faculty of Mechanical Engineering, Department of Cognitive Robotics

DATA

Files (1)

MATLAB data and code.zip (85,603,753 bytes; MD5: ecbd0e8fd9bd3f6d1d6abe15d843c933)