One-Way ANOVA
Use this page to decide when one-way ANOVA fits your design, how to read the omnibus and post hoc outputs, and where the current implementation still has limits.
One-way ANOVA tests whether the means of three or more independent groups differ from each other. The omnibus F-test tells you that at least one group differs; post hoc tests tell you which pairs. Licklider supports both classic and Welch one-way ANOVA in the current route, and classic output reports eta^2 alongside omega^2.
When to use one-way ANOVA
Use one-way ANOVA when you have:
- A continuous outcome variable
- Three or more independent groups defined by a single categorical factor
- Approximately normal distributions within each group, or sample sizes large enough that the central limit theorem gives you some robustness
If you have exactly two groups, use a t-Test instead. With two groups, classic one-way ANOVA is equivalent to the pooled-variance t-test (F = t^2), and Welch's one-way ANOVA is equivalent to Welch's t-test, but the t-test output is easier to read and keeps the direction of the difference obvious.
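The two-group equivalence is easy to verify directly. The sketch below (Python with NumPy/SciPy, purely illustrative, not Licklider's internals) shows that classic one-way ANOVA on two groups reproduces the pooled-variance t-test:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = rng.normal(10.0, 2.0, size=12)
b = rng.normal(11.5, 2.0, size=12)

# Classic (equal-variance) one-way ANOVA on two groups
# versus the pooled-variance (Student) t-test.
f_res = stats.f_oneway(a, b)
t_res = stats.ttest_ind(a, b, equal_var=True)

# F equals t squared, and the p-values coincide.
assert np.isclose(f_res.statistic, t_res.statistic**2)
assert np.isclose(f_res.pvalue, t_res.pvalue)
```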
If your data are ordinal or clearly non-normal with small samples, see Non-Parametric Alternatives for Kruskal-Wallis and other rank-based alternatives that are available in the current product.
If you have two categorical factors, use Two-Way ANOVA.
Assumptions
Licklider checks several data-shape assumptions automatically. Results appear in the Assumptions panel rather than inside the omnibus ANOVA computation itself.
These checks reduce common mistakes, but they do not validate your study design for you. In particular, Licklider can test normality and variance patterns in the observed table, but it cannot infer the correct observation unit or determine whether rows that look separate are actually repeated measurements from the same animal, patient, plate, well, litter, cage, or technical replicate set unless that structure is explicit in your data contract.
Normality
Licklider runs Shapiro-Wilk on each group. Groups with fewer than 3 observations or more than 5,000 observations are skipped. A flag appears when the grouped normality checks suggest the data are not comfortably normal.
One-way ANOVA is moderately robust to non-normality when group sizes are reasonably similar and not very small. With smaller or more unbalanced groups, the normality flags matter more. If several groups are flagged, review the Non-Parametric Alternatives page and decide whether a rank-based path is more defensible.
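A per-group Shapiro-Wilk pass with the size window described above can be sketched as follows. This is an illustration in Python/SciPy; the function name and return shape are assumptions, not Licklider's actual code:

```python
import numpy as np
from scipy import stats

def normality_flags(groups, min_n=3, max_n=5000):
    """Run Shapiro-Wilk per group; skip groups outside the size window.

    Returns {name: p_value_or_None}; None means the check was skipped.
    The thresholds mirror the behaviour described above but are illustrative.
    """
    out = {}
    for name, values in groups.items():
        x = np.asarray(values, dtype=float)
        if len(x) < min_n or len(x) > max_n:
            out[name] = None  # too small or too large to test meaningfully
        else:
            out[name] = stats.shapiro(x).pvalue
    return out

rng = np.random.default_rng(1)
groups = {
    "ctrl": rng.normal(size=20),
    "low": rng.normal(size=20),
    "high": rng.exponential(size=20),  # skewed group, likely to be flagged
}
flags = normality_flags(groups)
```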
Variance homogeneity
Licklider runs a median-based Levene test across all groups. A flag appears when the result suggests unequal variances.
Warning: If Levene flags unequal variances, the classic one-way ANOVA result still appears, but Welch's one-way ANOVA is the better-supported omnibus choice in that setting. Licklider's current panel also recommends Games-Howell as the unequal-variance post hoc companion. When variance differences are substantial, prefer Welch over the classic equal-variance result and review the Non-Parametric Alternatives page if distributional assumptions also look weak.
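For illustration, the median-based Levene check and a Welch one-way ANOVA can be sketched in Python/SciPy. SciPy has no built-in Welch ANOVA, so `welch_anova` below implements the standard Welch (1951) formula by hand; it is a sketch under those assumptions, not Licklider's implementation:

```python
import numpy as np
from scipy import stats

def welch_anova(*samples):
    """Welch's one-way ANOVA for unequal variances. Returns (F, df1, df2, p)."""
    k = len(samples)
    n = np.array([len(s) for s in samples], dtype=float)
    m = np.array([np.mean(s) for s in samples])
    v = np.array([np.var(s, ddof=1) for s in samples])
    w = n / v                        # precision weights
    mw = np.sum(w * m) / np.sum(w)   # variance-weighted grand mean
    tmp = np.sum((1 - w / np.sum(w)) ** 2 / (n - 1))
    f = (np.sum(w * (m - mw) ** 2) / (k - 1)) / (
        1 + 2 * (k - 2) / (k**2 - 1) * tmp
    )
    df1, df2 = k - 1, (k**2 - 1) / (3 * tmp)
    return f, df1, df2, stats.f.sf(f, df1, df2)

rng = np.random.default_rng(2)
g1 = rng.normal(0.0, 1.0, 15)
g2 = rng.normal(0.0, 3.0, 15)
g3 = rng.normal(1.0, 5.0, 15)  # deliberately heteroscedastic groups

# Median-based (Brown-Forsythe) Levene test, as described above.
lev_p = stats.levene(g1, g2, g3, center="median").pvalue
f, df1, df2, p = welch_anova(g1, g2, g3)
```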
Independence
Each observation must come from a different subject. If the same subjects appear under each condition, one-way ANOVA is the wrong model. Repeated-measures ANOVA is coming soon for that design family.
Important: Licklider does not automatically detect pseudoreplication or hidden non-independence when rows look separate in the table. If multiple rows come from the same animal, plate, well, litter, cage, donor, or technical replicate set, ordinary one-way ANOVA can underestimate uncertainty and make p-values look smaller than they should.
If your data are clustered, nested, or repeatedly measured, declare the observation unit explicitly during setup and review Observation Unit Declaration and Pseudoreplication Detection before relying on this result.
Sample size
Licklider warns when any group has fewer than 5 observations. The figure pipeline also skips one-way ANOVA metadata when fewer than 3 groups remain after cleaning and it requires at least 6 valid observations in total to proceed.
The omnibus F-test
The omnibus F-test answers one question: is the between-group variability large relative to the within-group variability? A statistically significant F tells you that at least one group mean differs. It does not tell you which groups differ. That is what post hoc testing adds.
Reading the output
| Field | What it means |
|---|---|
| F | Ratio of between-group variance to within-group variance. Larger values mean the group means are farther apart relative to within-group spread. |
| df (between) | Number of groups minus 1. |
| df (within) | Total observations minus number of groups. |
| p-value | Probability of observing an F at least this large if all group means were equal. |
| eta^2 | Proportion of total variance explained by group membership. |
| omega^2 | Less biased explained-variance estimate for the classic equal-variance model. |
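The quantities in the table can be derived by hand from the sums of squares. The sketch below (Python/NumPy/SciPy, illustrative only, not Licklider's engine code) computes F, p, eta^2, and omega^2 for the classic equal-variance model:

```python
import numpy as np
from scipy import stats

def one_way_effect_sizes(*samples):
    """Classic one-way ANOVA with eta^2 and omega^2 (illustrative sketch)."""
    all_x = np.concatenate([np.asarray(s, float) for s in samples])
    grand = all_x.mean()
    k, n_total = len(samples), len(all_x)
    ss_between = sum(len(s) * (np.mean(s) - grand) ** 2 for s in samples)
    ss_within = sum(np.sum((np.asarray(s) - np.mean(s)) ** 2) for s in samples)
    ss_total = ss_between + ss_within
    df_b, df_w = k - 1, n_total - k          # between/within degrees of freedom
    ms_w = ss_within / df_w
    f = (ss_between / df_b) / ms_w
    p = stats.f.sf(f, df_b, df_w)
    eta2 = ss_between / ss_total             # variance explained by groups
    omega2 = (ss_between - df_b * ms_w) / (ss_total + ms_w)  # less biased
    return f, p, eta2, omega2

rng = np.random.default_rng(4)
groups = [rng.normal(mu, 1.0, 12) for mu in (0.0, 0.5, 1.2)]
f, p, eta2, omega2 = one_way_effect_sizes(*groups)
```

The hand-rolled F agrees with `scipy.stats.f_oneway` on the same data, which is a convenient sanity check.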
eta^2 interpretation
These are conventional benchmarks from Cohen (1988). You should still interpret effect sizes in the context of the assay, endpoint, and field.
| eta^2 | Conventional label |
|---|---|
| 0.01 | Small |
| 0.06 | Medium |
| 0.14 | Large |
Warning: eta^2 has an upward bias in small samples. Licklider also reports omega^2 for the classic equal-variance model because it is typically less biased. In Welch mode, the output stays focused on F, degrees of freedom, and p-value rather than reporting eta^2 and omega^2.
Post hoc tests
Run post hoc tests after a statistically significant omnibus F when you need to identify the specific pairs that differ. Licklider currently supports Tukey HSD, Bonferroni, Scheffe, Dunnett, and Games-Howell in the one-way ANOVA path.
Choosing a post hoc test
| Method | Comparison structure | When to use |
|---|---|---|
| Tukey HSD | All pairwise | Default for exploratory analyses with balanced or near-balanced groups |
| Bonferroni | All pairwise | Preferred when group sizes are small or replicates are few; more conservative than Tukey |
| Scheffe | All pairwise | Most conservative; useful when comparisons were not tightly pre-specified |
| Dunnett | Each treatment vs one control | Best when every treatment is compared against one declared control |
| Games-Howell | All pairwise under unequal variances | Use when variance homogeneity is doubtful and you want a pairwise follow-up that matches Welch-style assumptions |
Licklider defaults to Tukey HSD. For low-replicate experiments, Licklider recommends Bonferroni, because early-phase life science designs often need stronger Type I error control when replicate counts are small.
Games-Howell is available as the unequal-variance post hoc option in the current one-way ANOVA path. When Levene suggests heteroscedasticity, it is the follow-up most naturally paired with Welch one-way ANOVA.
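For an all-pairs follow-up, SciPy's `tukey_hsd` returns matrices of pairwise differences, confidence intervals, and adjusted p-values. A minimal illustration (not Licklider's code path; the group names and effect sizes are made up):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
ctrl = rng.normal(1.0, 0.3, 10)
low = rng.normal(1.3, 0.3, 10)
high = rng.normal(1.6, 0.3, 10)

# All-pairs Tukey HSD; .statistic[i, j] is mean_i - mean_j,
# and .pvalue is a k x k matrix of adjusted p-values.
res = stats.tukey_hsd(ctrl, low, high)
ci = res.confidence_interval(confidence_level=0.95)

for i, j in [(0, 1), (0, 2), (1, 2)]:
    print(
        f"group {i} vs {j}: diff={res.statistic[i, j]:+.3f}, "
        f"95% CI [{ci.low[i, j]:.3f}, {ci.high[i, j]:.3f}], "
        f"p(adj)={res.pvalue[i, j]:.4f}"
    )
```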
Reading the post hoc comparison table
| Column | What it means |
|---|---|
| Group A / Group B | The two groups being compared |
| Mean difference | Mean of A minus mean of B. The sign tells you the direction |
| 95% CI | Confidence interval for the mean difference |
| p (adjusted) | p-value after the selected multiple-comparison correction |
| Cohen's d | Standardized effect size for that pair |
A non-significant adjusted p-value means the data remain compatible with no pairwise difference after correction. It does not prove the groups are equal. Read Cohen's d next to the adjusted p-value so you do not confuse detectability with practical magnitude.
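Cohen's d for a single pair divides the mean difference by the pooled standard deviation. A minimal sketch (illustrative; the function name is an assumption, not Licklider's implementation):

```python
import numpy as np

def cohens_d(a, b):
    """Pooled-SD Cohen's d for one pairwise comparison."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    na, nb = len(a), len(b)
    # Pooled variance weights each group by its degrees of freedom.
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

d = cohens_d([1.2, 1.4, 1.1, 1.5], [1.8, 2.0, 1.7, 2.1])
```

The sign follows the doc's convention (mean of A minus mean of B), so it carries the direction of the difference.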
The UI panel shows up to 20 comparisons at once. The full comparison table is preserved in the export bundle's `stats_meta.json`.
Dunnett-specific notes
Dunnett requires a designated control group. If you select Dunnett without a control group, the API returns an error instead of guessing one for you.
On older SciPy builds where Dunnett is unavailable, the backend falls back to Bonferroni for control-vs-treatment comparisons and records that fallback in the returned notes.
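That fallback pattern can be sketched as follows. This is an illustration in Python/SciPy; the function name and the exact fallback mechanics are assumptions, not Licklider's backend code:

```python
import numpy as np
from scipy import stats

def control_vs_treatments(control, treatments):
    """Dunnett if available; otherwise Bonferroni-corrected pairwise t-tests.

    Mirrors the fallback behaviour described above in spirit.
    Returns (adjusted_pvalues, note).
    """
    try:
        res = stats.dunnett(*treatments, control=control)  # SciPy >= 1.11
        return np.asarray(res.pvalue), "dunnett"
    except AttributeError:
        # Older SciPy: Bonferroni-adjust one t-test per control-vs-treatment pair.
        m = len(treatments)
        pvals = [min(1.0, stats.ttest_ind(t, control).pvalue * m) for t in treatments]
        return np.array(pvals), "bonferroni_fallback"

rng = np.random.default_rng(5)
control = rng.normal(1.0, 0.3, 10)
treatments = [rng.normal(1.2, 0.3, 10), rng.normal(1.6, 0.3, 10)]
pvals, note = control_vs_treatments(control, treatments)
```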
Example
Scenario
A researcher tests three drug concentrations (0 uM, 10 uM, 50 uM) on proliferation rate in a cell line. Each condition has 10 independent replicates.
Result (hypothetical)
One-way ANOVA: F(2, 27) = 9.43, p = .001, eta^2 = 0.41
Post hoc (Tukey HSD):
- 0 uM vs 10 uM: mean diff = +0.31, 95% CI [0.08, 0.54], p(adj) = .007, d = 0.82
- 0 uM vs 50 uM: mean diff = +0.58, 95% CI [0.35, 0.81], p(adj) < .001, d = 1.54
- 10 uM vs 50 uM: mean diff = +0.27, 95% CI [0.04, 0.50], p(adj) = .018, d = 0.71
Interpretation
The omnibus result shows that proliferation differs across concentration conditions. The effect is large on the conventional eta^2 scale, which means group membership explains a substantial share of the variance in proliferation.
The post hoc table then tells you where the differences sit. In this example, all three pairwise contrasts remain statistically reliable after Tukey correction, and the Cohen's d values indicate that the 0 uM to 50 uM difference is especially large. The overall pattern supports a dose-response interpretation rather than a single isolated pairwise effect.
Design Rationale & References
Licklider's design choices
Licklider defaults to Tukey HSD as a practical all-pairs baseline for balanced exploratory designs. For low-replicate experiments, Licklider recommends Bonferroni because recent life science evidence argues that Tukey can admit too many false positives when replicate counts are small (Zweifach, 2025). For unequal variances, the current panel recommends Welch one-way ANOVA together with Games-Howell so the omnibus and pairwise follow-up stay aligned with the unequal-variance setting. In the classic equal-variance path, Licklider reports both eta^2 and omega^2 so readers can see a familiar explained-variance summary alongside a less biased companion estimate (Lakens, 2013). Exact p-values replace threshold labels throughout the interface, consistent with ASA guidance (Wasserstein & Lazar, 2016).
Methodological foundations
- Lakens, D. (2013). Calculating and reporting effect sizes to facilitate cumulative science: A practical primer for t-tests and ANOVAs. Frontiers in Psychology, 4, 863. https://doi.org/10.3389/fpsyg.2013.00863
- Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Routledge.
- Sullivan, G. M., & Feinn, R. (2012). Using effect size - or why the p-value is not enough. Journal of Graduate Medical Education, 4(3), 279-282. https://doi.org/10.4300/JGME-D-12-00156.1
Known limitations
- Nakagawa, S. (2004). A farewell to Bonferroni: The problems of low statistical power and publication bias. Behavioral Ecology, 15(6), 1044-1045. https://doi.org/10.1093/beheco/arh107
- Zweifach, A. (2025). Bonferroni's method, not Tukey's, should be used to control the total number of false positives when making multiple pairwise comparisons in experiments with few replicates. SLAS Discovery, 35, 100253. https://doi.org/10.1016/j.slasd.2025.100253
Paradigm shifts worth knowing
- Wasserstein, R. L., & Lazar, N. A. (2016). The ASA statement on p-values: Context, process, and purpose. The American Statistician, 70(2), 129-133. https://doi.org/10.1080/00031305.2016.1154108
Implementation boundaries
- The current one-way ANOVA route supports both the classic equal-variance formulation and Welch's one-way ANOVA.
- The classic path reports eta^2 and omega^2. Welch output keeps effect-size fields null in the current engine response.
- The figure pipeline samples up to 20,000 rows before it calls the engine. The API route accepts up to 30,000 rows.
- The figure pipeline skips ANOVA metadata when fewer than 3 groups remain after cleaning and marks that state as `requires_three_or_more_groups`.
- Dunnett requires `control_group`. If the control group is missing or unknown, the API responds with an error.
- Games-Howell is accepted in the current route and engine as the unequal-variance pairwise option.
- Licklider does not infer whether each row is the correct independent observation unit. If biological replicates, technical replicates, repeated measurements, or nested units are mixed together without an explicit declaration, the software may still run one-way ANOVA even though the design is not valid for it.
- Licklider does not decide whether a declared control group is scientifically appropriate for Dunnett, or whether your comparison family matches your protocol. Those choices must come from the study design, not from the observed result.
See also
- t-Test - exactly two groups
- Two-Way ANOVA - two categorical factors
- Non-Parametric Alternatives - Kruskal-Wallis and other rank-based alternatives
- Group Comparison - parent overview and test-selection guide