Repeated Measures ANOVA

One-way repeated measures ANOVA for three or more within-subject conditions: selection rules, Mauchly sphericity, GG/HF corrections, Inspector output, balanced-design requirements, and current limitations.

Repeated measures ANOVA compares means across three or more conditions when each subject is measured under all conditions. It is the parametric counterpart to Friedman's test and is selected automatically when the data has a paired design, three or more groups, and all groups pass the normality check.


When Licklider selects this test

Licklider selects repeated measures ANOVA when all three of the following criteria are met:

  • The design is paired (a subject identifier column is present)
  • There are three or more conditions
  • All groups pass the Shapiro-Wilk normality test

If any group fails the normality check, Friedman's test is selected instead.

For two-condition paired designs, Licklider selects the paired t-test (or Wilcoxon signed-rank if non-normal) rather than repeated measures ANOVA.
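The selection rules for paired designs can be sketched as a small decision function. This is an illustrative sketch only, not Licklider's actual code: the function name `select_paired_test` and its parameters are hypothetical, and the normality result is assumed to have been computed already (for example, by a Shapiro-Wilk test on each group).

```python
def select_paired_test(n_conditions: int, all_groups_normal: bool) -> str:
    """Return the test name for a paired design, following the rules above.

    Hypothetical helper for illustration; not part of Licklider's API.
    """
    if n_conditions < 2:
        raise ValueError("a paired comparison needs at least two conditions")
    if n_conditions == 2:
        # Two conditions: paired t-test, or its rank-based alternative.
        return "Paired t-test" if all_groups_normal else "Wilcoxon signed-rank"
    # Three or more conditions: repeated measures ANOVA, or Friedman's test.
    return "Repeated measures ANOVA" if all_groups_normal else "Friedman"


print(select_paired_test(4, False))  # prints Friedman
```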


Sphericity and corrections

Repeated measures ANOVA assumes sphericity — that the variances of the differences between all pairs of conditions are equal. Licklider tests this assumption automatically using Mauchly's test of sphericity.
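To make the sphericity assumption concrete, the sketch below computes the variance of the differences for every pair of conditions. Under sphericity these variances would be roughly equal. The data and condition names are invented for illustration, and this is an informal check, not Mauchly's test itself.

```python
from itertools import combinations
from statistics import variance

# Hypothetical data: one list per condition, subjects in the same order.
scores = {
    "baseline": [10.0, 12.0, 11.0, 14.0, 13.0],
    "week_1": [12.0, 13.0, 13.0, 15.0, 16.0],
    "week_2": [15.0, 14.0, 16.0, 19.0, 18.0],
}


def difference_variances(data):
    """Sample variance of the within-subject differences for each pair."""
    result = {}
    for a, b in combinations(data, 2):
        diffs = [x - y for x, y in zip(data[a], data[b])]
        result[(a, b)] = variance(diffs)
    return result


diff_vars = difference_variances(scores)
for pair, var in diff_vars.items():
    print(pair, round(var, 3))
```

Large disparities among the printed variances suggest that sphericity may not hold; the formal decision is made by Mauchly's test.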

When sphericity is violated (Mauchly's test p ≤ 0.05), the degrees of freedom need to be corrected to control the Type I error rate. Licklider applies:

  • Greenhouse-Geisser correction when epsilon < 0.75
  • Huynh-Feldt correction when epsilon ≥ 0.75
  • No correction when sphericity is met

The correction is selected automatically. The p-value shown in the results panel and the statistical results table is the corrected p-value when a correction is applied.
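The correction rule above can be written as a short function. This is a sketch under stated assumptions: the name `choose_correction` is hypothetical, and because this page does not say which epsilon estimate is compared against 0.75, the function simply takes the epsilon value as an input.

```python
def choose_correction(mauchly_p: float, epsilon: float, alpha: float = 0.05) -> str:
    """Pick a sphericity correction from Mauchly's p-value and epsilon.

    Hypothetical helper for illustration; not part of Licklider's API.
    """
    if mauchly_p > alpha:
        # Sphericity is not rejected, so no correction is applied.
        return "none"
    # Sphericity violated (p <= alpha): choose by severity of the violation.
    return "Greenhouse-Geisser" if epsilon < 0.75 else "Huynh-Feldt"


print(choose_correction(mauchly_p=0.01, epsilon=0.62))  # prints Greenhouse-Geisser
```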

Why this epsilon threshold. Greenhouse-Geisser (1959) is the more conservative estimator and is recommended when epsilon is low (severe sphericity violation). Huynh-Feldt (1976) is less biased when epsilon is close to 1, so it is preferred when violation is mild. The 0.75 cutoff follows the recommendation by Huynh and Feldt (1976) and is the standard adopted by major statistical software (Field, 2013). Mauchly's test uses the conventional alpha = 0.05 significance level for consistency with other assumption checks in Licklider.
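For readers who want to see where epsilon comes from, the following sketch computes the Greenhouse-Geisser estimate from the sample covariance matrix of the conditions, using the standard textbook formula (double-center the matrix, then divide the squared trace by (k − 1) times the sum of squared entries). It is not taken from Licklider's source code; the function name and the input format (a k × k list of lists) are assumptions made for the example.

```python
def greenhouse_geisser_epsilon(cov):
    """Greenhouse-Geisser epsilon from a k x k covariance matrix (list of lists).

    Illustrative sketch of the textbook formula; the result lies between
    1 / (k - 1) (maximal violation) and 1 (sphericity holds).
    """
    k = len(cov)
    row_means = [sum(row) / k for row in cov]
    col_means = [sum(cov[i][j] for i in range(k)) / k for j in range(k)]
    grand_mean = sum(row_means) / k
    # Double-center: subtract row and column means, add back the grand mean.
    centered = [
        [cov[i][j] - row_means[i] - col_means[j] + grand_mean for j in range(k)]
        for i in range(k)
    ]
    trace = sum(centered[i][i] for i in range(k))
    sum_sq = sum(v * v for row in centered for v in row)
    return trace ** 2 / ((k - 1) * sum_sq)


# Equal variances and equal covariances (compound symmetry) satisfy sphericity.
spherical = [[2.0, 0.5, 0.5], [0.5, 2.0, 0.5], [0.5, 0.5, 2.0]]
print(round(greenhouse_geisser_epsilon(spherical), 6))  # prints 1.0
```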


Reading the output

When Licklider runs a repeated measures ANOVA, the analysis produces three types of output simultaneously:

  1. Statistical results panel (Inspector). Displays the F statistic, corrected degrees of freedom, p-value, effect size, and sphericity diagnostics.
  2. Figure. A line chart (profile plot) showing the mean and 95% confidence interval for each condition, with individual subject trajectories overlaid as grey lines. This allows you to see both the group trend and within-subject variability at once. The figure is publication-ready and can be exported in SVG or PNG format.
  3. Methods text. An auto-generated methods paragraph describing the test selected, the sphericity correction applied (if any), and the effect size, suitable for inclusion in a manuscript methods section.

The Inspector panel reports:

Field                 What it means
F                     F statistic for the within-subjects effect
df                    Degrees of freedom (numerator, denominator). May be non-integer when corrected
p                     p-value. Corrected if sphericity is violated
partial eta-squared   Partial eta-squared effect size
Mauchly's W           Mauchly's test statistic
Mauchly p             p-value for the sphericity test
Correction            none, Greenhouse-Geisser, or Huynh-Feldt
epsilon               Epsilon value for the applied correction
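The reason df can be non-integer is that both degrees of freedom are multiplied by epsilon when a correction is applied. A minimal sketch, using the standard one-way repeated measures formulas; the function name is hypothetical:

```python
def corrected_df(n_subjects: int, n_conditions: int, epsilon: float = 1.0):
    """Degrees of freedom for one-way repeated measures ANOVA.

    Uncorrected: numerator = k - 1, denominator = (k - 1) * (n - 1).
    With a sphericity correction, both are multiplied by epsilon.
    """
    df_num = n_conditions - 1
    df_den = (n_conditions - 1) * (n_subjects - 1)
    return epsilon * df_num, epsilon * df_den


# Example: 10 subjects, 4 conditions, epsilon = 0.6.
df1, df2 = corrected_df(10, 4, epsilon=0.6)
print(round(df1, 2), round(df2, 2))  # prints 1.8 16.2
```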

Effect size interpretation (partial eta-squared):

Value          Interpretation
< 0.01         Negligible
0.01 – 0.06    Small
0.06 – 0.14    Medium
≥ 0.14         Large

These are conventional benchmarks following Cohen (1988; see References). Interpret them in the context of the research question and the practical significance of the effect, not as fixed cutoffs.
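Partial eta-squared can be recovered from the reported F statistic and its degrees of freedom through the identity F × df1 / (F × df1 + df2), which gives the same value with corrected or uncorrected df because epsilon cancels. The sketch below applies that identity and the benchmark table. The function names are hypothetical, and treating each band's lower bound as inclusive is an assumption, since the table does not say which band a boundary value such as 0.06 belongs to.

```python
def partial_eta_squared(f_stat: float, df_num: float, df_den: float) -> float:
    """Partial eta-squared from F and its degrees of freedom."""
    return (f_stat * df_num) / (f_stat * df_num + df_den)


def interpret_effect_size(eta_p2: float) -> str:
    """Map partial eta-squared to the conventional labels.

    Assumption: lower bounds are inclusive (0.06 counts as Medium).
    """
    if eta_p2 < 0.01:
        return "Negligible"
    if eta_p2 < 0.06:
        return "Small"
    if eta_p2 < 0.14:
        return "Medium"
    return "Large"


eta = partial_eta_squared(5.0, 2, 18)
print(round(eta, 3), interpret_effect_size(eta))  # prints 0.357 Large
```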


Data requirements

Repeated measures ANOVA requires:

  • A subject identifier column (each subject has a unique ID)
  • A condition column (the within-subjects factor)
  • A value column (the measurement)
  • Balanced design: every subject must appear in every condition exactly once

If the design is unbalanced (missing observations for some subject-condition combinations), Licklider will return an error rather than produce potentially misleading results.
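Before importing data, the balance requirement can be checked with a few lines of code. The sketch below is a hypothetical pre-flight check, not Licklider's own validation: it takes (subject, condition) pairs and lists every combination that does not appear exactly once.

```python
from collections import Counter


def find_balance_problems(rows):
    """Return (subject, condition, count) for every cell whose count is not 1.

    rows: iterable of (subject_id, condition) pairs, one per observation.
    An empty result means the design is balanced.
    """
    rows = list(rows)
    counts = Counter(rows)
    subjects = sorted({subject for subject, _ in rows})
    conditions = sorted({condition for _, condition in rows})
    problems = []
    for subject in subjects:
        for condition in conditions:
            n = counts[(subject, condition)]
            if n != 1:  # 0 = missing observation, 2+ = duplicate
                problems.append((subject, condition, n))
    return problems


observations = [
    ("s1", "A"), ("s1", "B"), ("s1", "C"),
    ("s2", "A"), ("s2", "B"),  # s2 is missing condition C
]
print(find_balance_problems(observations))  # prints [('s2', 'C', 0)]
```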


Relationship to other tests

Scenario                                     Test selected
Mixed design (between x within), balanced    Mixed ANOVA
Paired, 3+ groups, normal                    Repeated measures ANOVA
Paired, 3+ groups, non-normal                Friedman
Paired, 2 groups, normal                     Paired t-test
Paired, 2 groups, non-normal                 Wilcoxon signed-rank
Independent, 3+ groups, normal               One-way ANOVA
Independent, 3+ groups, non-normal           Kruskal-Wallis

What Licklider cannot verify

Licklider checks sphericity, normality, and balance automatically. However, several problems cannot be detected from the data alone and remain the researcher's responsibility.

Carryover and order effects. If one condition systematically alters responses in later conditions — through practice, fatigue, sensitization, or learning — the independence assumption of repeated measures ANOVA is violated. This cannot be detected from the data structure; it must be addressed in the experimental design (e.g., counterbalanced condition order across subjects, sufficient washout between sessions).

Inappropriate subject identifier. Licklider requires a subject identifier column and verifies that the design is balanced. It does not verify that observations sharing the same identifier genuinely come from the same biological or experimental unit. Misassigning identifiers produces a structurally valid but scientifically invalid analysis.

Temporal or spatial correlation. When repeated measurements are spaced closely in time or space, residual autocorrelation may remain even after sphericity correction. Licklider does not model autocorrelation structure; if this is a concern, consider a linear mixed model (see Repeated Measures and Mixed Models).


Current limitations

  • Only one-way (single within-subjects factor) is supported
  • Mixed designs with one between-subjects factor and one within-subjects factor are handled by Mixed ANOVA, which is documented on a separate method page
  • Two-way repeated measures (within x within) is not yet available
  • Post hoc pairwise comparisons are not yet available in this version. For pairwise follow-up, use paired t-tests with Bonferroni correction
  • Unbalanced designs are not supported
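
For the manual pairwise follow-up mentioned in the list above, the bookkeeping can be sketched as follows. With k conditions there are k(k − 1)/2 pairs; each pair gets a paired t statistic with n − 1 degrees of freedom, and each resulting p-value is multiplied by the number of comparisons (capped at 1). The function names are hypothetical, and converting t to a p-value is left to a statistics library because it requires the t distribution.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_stat, n - 1


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical data for one pair of conditions (same subject order in both lists).
t_stat, df = paired_t([12.0, 13.0, 13.0, 15.0, 16.0], [10.0, 12.0, 11.0, 14.0, 13.0])
print(round(t_stat, 3), df)

# Hypothetical unadjusted p-values from three pairwise tests.
print(bonferroni([0.01, 0.04, 0.5]))
```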

References

Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Lawrence Erlbaum Associates. — Source for partial eta-squared interpretation benchmarks (small = 0.01, medium = 0.06, large = 0.14).

Greenhouse, S. W., & Geisser, S. (1959). On methods in the analysis of profile data. Psychometrika, 24(2), 95–112. — Original derivation of the Greenhouse-Geisser epsilon correction for sphericity violation.

Huynh, H., & Feldt, L. S. (1976). Estimation of the Box correction for degrees of freedom from sample data in randomized block and split-plot designs. Journal of Educational Statistics, 1(1), 69–82. — Original derivation of the Huynh-Feldt correction and the recommendation to prefer it over GG when epsilon ≥ 0.75.

Mauchly, J. W. (1940). Significance test for sphericity of a normal n-variate distribution. The Annals of Mathematical Statistics, 11(2), 204–209. — Original derivation of Mauchly's test of sphericity.