One-Way ANOVA
Use this page to decide when one-way ANOVA fits your design, how to read the omnibus and post hoc outputs, and where the current implementation still has limits.
One-way ANOVA tests whether the means of three or more independent groups differ from each other. The omnibus F-test tells you that at least one group differs; post hoc tests tell you which pairs. Licklider supports both classic and Welch one-way ANOVA in the current route, and classic output reports eta^2 alongside omega^2.
When to use one-way ANOVA
Use one-way ANOVA when you have:
- A continuous outcome variable
- Three or more independent groups defined by a single categorical factor
- Approximately normal distributions within each group, or sample sizes large enough that the central limit theorem gives you some robustness
If you have exactly two groups, use a t-Test instead. With two groups, classic one-way ANOVA is equivalent to the pooled-variance t-test (F = t^2), and Welch's one-way ANOVA is equivalent to Welch's t-test, but the t-test output is easier to read and keeps the direction of the difference obvious.
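The two-group equivalence is easy to verify directly. The sketch below (Python with NumPy/SciPy, purely illustrative, not Licklider's internals) shows that classic one-way ANOVA on two groups reproduces the pooled-variance t-test:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = rng.normal(10.0, 2.0, size=12)
b = rng.normal(11.5, 2.0, size=12)

# Classic (equal-variance) one-way ANOVA on two groups
# versus the pooled-variance (Student) t-test.
f_res = stats.f_oneway(a, b)
t_res = stats.ttest_ind(a, b, equal_var=True)

# F equals t squared, and the p-values coincide.
assert np.isclose(f_res.statistic, t_res.statistic**2)
assert np.isclose(f_res.pvalue, t_res.pvalue)
```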
If your data are ordinal or clearly non-normal with small samples, see Non-Parametric Alternatives for Kruskal-Wallis and other rank-based alternatives that are available in the current product.
If you have two categorical factors, use Two-Way ANOVA.
Assumptions
Licklider checks several data-shape assumptions automatically. Results appear in the Assumptions panel rather than inside the omnibus ANOVA computation itself.
These checks reduce common mistakes, but they do not validate your study design for you. In particular, Licklider can test normality and variance patterns in the observed table, but it cannot infer the correct observation unit or determine whether rows that look separate are actually repeated measurements from the same animal, patient, plate, well, litter, cage, or technical replicate set unless that structure is explicit in your data contract.
Normality
Licklider runs Shapiro-Wilk on each group. Groups with fewer than 3 observations or more than 5,000 observations are skipped. A flag appears when the grouped normality checks suggest the data are not comfortably normal.
One-way ANOVA is moderately robust to non-normality when group sizes are reasonably similar and not very small. With smaller or more unbalanced groups, the normality flags matter more. If several groups are flagged, review the Non-Parametric Alternatives page and decide whether a rank-based path is more defensible.
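A per-group Shapiro-Wilk pass with the size window described above can be sketched as follows. This is an illustration in Python/SciPy; the function name and return shape are assumptions, not Licklider's actual code:

```python
import numpy as np
from scipy import stats

def normality_flags(groups, min_n=3, max_n=5000):
    """Run Shapiro-Wilk per group; skip groups outside the size window.

    Returns {name: p_value_or_None}; None means the check was skipped.
    The thresholds mirror the behaviour described above but are illustrative.
    """
    out = {}
    for name, values in groups.items():
        x = np.asarray(values, dtype=float)
        if len(x) < min_n or len(x) > max_n:
            out[name] = None  # too small or too large to test meaningfully
        else:
            out[name] = stats.shapiro(x).pvalue
    return out

rng = np.random.default_rng(1)
groups = {
    "ctrl": rng.normal(size=20),
    "low": rng.normal(size=20),
    "high": rng.exponential(size=20),  # skewed group, likely to be flagged
}
flags = normality_flags(groups)
```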
Variance homogeneity
Licklider runs a median-based Levene test across all groups. A flag appears when the result suggests unequal variances.
Warning: If Levene flags unequal variances, the classic one-way ANOVA result still appears, but Welch's one-way ANOVA is the better-supported omnibus choice in that setting. Licklider's current panel also recommends Games-Howell as the unequal-variance post hoc companion. When variance differences are substantial, prefer Welch over the classic equal-variance result and review the Non-Parametric Alternatives page if distributional assumptions also look weak.
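For illustration, the median-based Levene check and a Welch one-way ANOVA can be sketched in Python/SciPy. SciPy has no built-in Welch ANOVA, so `welch_anova` below implements the standard Welch (1951) formula by hand; it is a sketch under those assumptions, not Licklider's implementation:

```python
import numpy as np
from scipy import stats

def welch_anova(*samples):
    """Welch's one-way ANOVA for unequal variances. Returns (F, df1, df2, p)."""
    k = len(samples)
    n = np.array([len(s) for s in samples], dtype=float)
    m = np.array([np.mean(s) for s in samples])
    v = np.array([np.var(s, ddof=1) for s in samples])
    w = n / v                        # precision weights
    mw = np.sum(w * m) / np.sum(w)   # variance-weighted grand mean
    tmp = np.sum((1 - w / np.sum(w)) ** 2 / (n - 1))
    f = (np.sum(w * (m - mw) ** 2) / (k - 1)) / (
        1 + 2 * (k - 2) / (k**2 - 1) * tmp
    )
    df1, df2 = k - 1, (k**2 - 1) / (3 * tmp)
    return f, df1, df2, stats.f.sf(f, df1, df2)

rng = np.random.default_rng(2)
g1 = rng.normal(0.0, 1.0, 15)
g2 = rng.normal(0.0, 3.0, 15)
g3 = rng.normal(1.0, 5.0, 15)  # deliberately heteroscedastic groups

# Median-based (Brown-Forsythe) Levene test, as described above.
lev_p = stats.levene(g1, g2, g3, center="median").pvalue
f, df1, df2, p = welch_anova(g1, g2, g3)
```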
Independence
Each observation must come from a different subject. If the same subjects appear under each condition, one-way ANOVA is the wrong model. Repeated-measures ANOVA is coming soon for that design family.
Important: Licklider does not automatically detect pseudoreplication or hidden non-independence when rows look separate in the table. If multiple rows come from the same animal, plate, well, litter, cage, donor, or technical replicate set, ordinary one-way ANOVA can underestimate uncertainty and make p-values look smaller than they should.
If your data are clustered, nested, or repeatedly measured, declare the observation unit explicitly during setup and review Observation Unit Declaration and Pseudoreplication Detection before relying on this result.
Sample size
Licklider warns when any group has fewer than 5 observations. The figure pipeline also skips one-way ANOVA metadata when fewer than 3 groups remain after cleaning and it requires at least 6 valid observations in total to proceed.
The omnibus F-test
The omnibus F-test answers one question: is the between-group variability large relative to the within-group variability? A statistically significant F tells you that at least one group mean differs. It does not tell you which groups differ. That is what post hoc testing adds.
Reading the output
| Field | What it means |
|---|---|
| F | Ratio of between-group variance to within-group variance. Larger values mean the group means are farther apart relative to within-group spread. |
| df (between) | Number of groups minus 1. |
| df (within) | Total observations minus number of groups. |
| p-value | Probability of observing an F at least this large if all group means were equal. |
| eta^2 | Proportion of total variance explained by group membership. |
| omega^2 | Less biased explained-variance estimate for the classic equal-variance model. |
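The quantities in the table can be derived by hand from the sums of squares. The sketch below (Python/NumPy/SciPy, illustrative only, not Licklider's engine code) computes F, p, eta^2, and omega^2 for the classic equal-variance model:

```python
import numpy as np
from scipy import stats

def one_way_effect_sizes(*samples):
    """Classic one-way ANOVA with eta^2 and omega^2 (illustrative sketch)."""
    all_x = np.concatenate([np.asarray(s, float) for s in samples])
    grand = all_x.mean()
    k, n_total = len(samples), len(all_x)
    ss_between = sum(len(s) * (np.mean(s) - grand) ** 2 for s in samples)
    ss_within = sum(np.sum((np.asarray(s) - np.mean(s)) ** 2) for s in samples)
    ss_total = ss_between + ss_within
    df_b, df_w = k - 1, n_total - k          # between/within degrees of freedom
    ms_w = ss_within / df_w
    f = (ss_between / df_b) / ms_w
    p = stats.f.sf(f, df_b, df_w)
    eta2 = ss_between / ss_total             # variance explained by groups
    omega2 = (ss_between - df_b * ms_w) / (ss_total + ms_w)  # less biased
    return f, p, eta2, omega2

rng = np.random.default_rng(4)
groups = [rng.normal(mu, 1.0, 12) for mu in (0.0, 0.5, 1.2)]
f, p, eta2, omega2 = one_way_effect_sizes(*groups)
```

The hand-rolled F agrees with `scipy.stats.f_oneway` on the same data, which is a convenient sanity check.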
eta^2 interpretation
These are conventional benchmarks from Cohen (1988). You should still interpret effect sizes in the context of the assay, endpoint, and field.
| eta^2 | Conventional label |
|---|---|
| 0.01 | Small |
| 0.06 | Medium |
| 0.14 | Large |
Warning: eta^2 has an upward bias in small samples. Licklider also reports omega^2 for the classic equal-variance model because it is typically less biased. In Welch mode, the output stays focused on F, degrees of freedom, and p-value rather than reporting eta^2 and omega^2.
Post hoc tests
Run post hoc tests after a statistically significant omnibus F when you need to identify the specific pairs that differ. Licklider currently supports Tukey HSD, Bonferroni, Scheffe, Dunnett, and Games-Howell in the one-way ANOVA path.
Choosing a post hoc test
| Method | Comparison structure | When to use |
|---|---|---|
| Tukey HSD | All pairwise | Default for exploratory analyses with balanced or near-balanced groups |
| Bonferroni | All pairwise | Preferred when group sizes are small or replicates are few; more conservative than Tukey |
| Scheffe | All pairwise | Most conservative; useful when comparisons were not tightly pre-specified |
| Dunnett | Each treatment vs one control | Best when every treatment is compared against one declared control |
| Games-Howell | All pairwise under unequal variances | Use when variance homogeneity is doubtful and you want a pairwise follow-up that matches Welch-style assumptions |
Licklider defaults to Tukey HSD. For low-replicate experiments, Licklider recommends Bonferroni, because early-phase life science designs often need stronger Type I error control when replicate counts are small.
Games-Howell is available as the unequal-variance post hoc option in the current one-way ANOVA path. When Levene suggests heteroscedasticity, it is the follow-up most naturally paired with Welch one-way ANOVA.
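For an all-pairs follow-up, SciPy's `tukey_hsd` returns matrices of pairwise differences, confidence intervals, and adjusted p-values. A minimal illustration (not Licklider's code path; the group names and effect sizes are made up):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
ctrl = rng.normal(1.0, 0.3, 10)
low = rng.normal(1.3, 0.3, 10)
high = rng.normal(1.6, 0.3, 10)

# All-pairs Tukey HSD; .statistic[i, j] is mean_i - mean_j,
# and .pvalue is a k x k matrix of adjusted p-values.
res = stats.tukey_hsd(ctrl, low, high)
ci = res.confidence_interval(confidence_level=0.95)

for i, j in [(0, 1), (0, 2), (1, 2)]:
    print(
        f"group {i} vs {j}: diff={res.statistic[i, j]:+.3f}, "
        f"95% CI [{ci.low[i, j]:.3f}, {ci.high[i, j]:.3f}], "
        f"p(adj)={res.pvalue[i, j]:.4f}"
    )
```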
Reading the post hoc comparison table
| Column | What it means |
|---|---|
| Group A / Group B | The two groups being compared |
| Mean difference | Mean of A minus mean of B. The sign tells you the direction |
| 95% CI | Confidence interval for the mean difference |
| p (adjusted) | p-value after the selected multiple-comparison correction |
| Cohen's d | Standardized effect size for that pair |
A non-significant adjusted p-value means the data remain compatible with no pairwise difference after correction. It does not prove the groups are equal. Read Cohen's d next to the adjusted p-value so you do not confuse detectability with practical magnitude.
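Cohen's d for a single pair divides the mean difference by the pooled standard deviation. A minimal sketch (illustrative; the function name is an assumption, not Licklider's implementation):

```python
import numpy as np

def cohens_d(a, b):
    """Pooled-SD Cohen's d for one pairwise comparison."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    na, nb = len(a), len(b)
    # Pooled variance weights each group by its degrees of freedom.
    pooled_var = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled_var)

d = cohens_d([1.2, 1.4, 1.1, 1.5], [1.8, 2.0, 1.7, 2.1])
```

The sign follows the doc's convention (mean of A minus mean of B), so it carries the direction of the difference.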
The UI panel shows up to 20 comparisons at once. The full comparison table is preserved in the export bundle's `stats_meta.json`.
Dunnett-specific notes
Dunnett requires a designated control group. If you select Dunnett without a control group, the API returns an error instead of guessing one for you.
On older SciPy builds where Dunnett is unavailable, the backend falls back to Bonferroni for control-vs-treatment comparisons and records that fallback in the returned notes.
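That fallback pattern can be sketched as follows. This is an illustration in Python/SciPy; the function name and the exact fallback mechanics are assumptions, not Licklider's backend code:

```python
import numpy as np
from scipy import stats

def control_vs_treatments(control, treatments):
    """Dunnett if available; otherwise Bonferroni-corrected pairwise t-tests.

    Mirrors the fallback behaviour described above in spirit.
    Returns (adjusted_pvalues, note).
    """
    try:
        res = stats.dunnett(*treatments, control=control)  # SciPy >= 1.11
        return np.asarray(res.pvalue), "dunnett"
    except AttributeError:
        # Older SciPy: Bonferroni-adjust one t-test per control-vs-treatment pair.
        m = len(treatments)
        pvals = [min(1.0, stats.ttest_ind(t, control).pvalue * m) for t in treatments]
        return np.array(pvals), "bonferroni_fallback"

rng = np.random.default_rng(5)
control = rng.normal(1.0, 0.3, 10)
treatments = [rng.normal(1.2, 0.3, 10), rng.normal(1.6, 0.3, 10)]
pvals, note = control_vs_treatments(control, treatments)
```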
Example
Scenario
A researcher tests three drug concentrations (0 uM, 10 uM, 50 uM) on proliferation rate in a cell line. Each condition has 10 independent replicates.
Result (hypothetical)
One-way ANOVA: F(2, 27) = 9.43, p = .001, eta^2 = 0.41
Post hoc (Tukey HSD):
- 0 uM vs 10 uM: mean diff = +0.31, 95% CI [0.08, 0.54], p(adj) = .007, d = 0.82
- 0 uM vs 50 uM: mean diff = +0.58, 95% CI [0.35, 0.81], p(adj) < .001, d = 1.54
- 10 uM vs 50 uM: mean diff = +0.27, 95% CI [0.04, 0.50], p(adj) = .018, d = 0.71
Interpretation
The omnibus result shows that proliferation differs across concentration conditions. The effect is large on the conventional eta^2 scale, which means group membership explains a substantial share of the variance in proliferation.
The post hoc table then tells you where the differences sit. In this example, all three pairwise contrasts remain statistically reliable after Tukey correction, and the Cohen's d values indicate that the 0 uM to 50 uM difference is especially large. The overall pattern supports a dose-response interpretation rather than a single isolated pairwise effect.
Design Rationale & References
Licklider's design choices
Licklider defaults to Tukey HSD as a practical all-pairs baseline for balanced exploratory designs. For low-replicate experiments, Licklider recommends Bonferroni because recent life science evidence argues that Tukey can admit too many false positives when replicate counts are small (Zweifach, 2025). For unequal variances, the current panel recommends Welch one-way ANOVA together with Games-Howell so the omnibus and pairwise follow-up stay aligned with the unequal-variance setting. In the classic equal-variance path, Licklider reports both eta^2 and omega^2 so readers can see a familiar explained-variance summary alongside a less biased companion estimate (Lakens, 2013). Exact p-values replace threshold labels throughout the interface, consistent with ASA guidance (Wasserstein & Lazar, 2016).
Methodological foundations
- Lakens, D. (2013). Calculating and reporting effect sizes to facilitate cumulative science: A practical primer for t-tests and ANOVAs. Frontiers in Psychology, 4, 863. https://doi.org/10.3389/fpsyg.2013.00863
- Cohen, J. (1988). Statistical Power Analysis for the Behavioral Sciences (2nd ed.). Routledge.
- Sullivan, G. M., & Feinn, R. (2012). Using effect size - or why the p-value is not enough. Journal of Graduate Medical Education, 4(3), 279-282. https://doi.org/10.4300/JGME-D-12-00156.1
Known limitations
- Nakagawa, S. (2004). A farewell to Bonferroni: The problems of low statistical power and publication bias. Behavioral Ecology, 15(6), 1044-1045. https://doi.org/10.1093/beheco/arh107
- Zweifach, A. (2025). Bonferroni's method, not Tukey's, should be used to control the total number of false positives when making multiple pairwise comparisons in experiments with few replicates. SLAS Discovery, 35, 100253. https://doi.org/10.1016/j.slasd.2025.100253
Paradigm shifts worth knowing
- Wasserstein, R. L., & Lazar, N. A. (2016). The ASA statement on p-values: Context, process, and purpose. The American Statistician, 70(2), 129-133. https://doi.org/10.1080/00031305.2016.1154108
Implementation boundaries
- The current one-way ANOVA route supports both the classic equal-variance formulation and Welch's one-way ANOVA.
- The classic path reports eta^2 and omega^2. Welch output keeps effect-size fields null in the current engine response.
- The figure pipeline samples up to 20,000 rows before it calls the engine. The API route accepts up to 30,000 rows.
- The figure pipeline skips ANOVA metadata when fewer than 3 groups remain after cleaning and marks that state as `requires_three_or_more_groups`.
- Dunnett requires `control_group`. If the control group is missing or unknown, the API responds with an error.
- Games-Howell is accepted in the current route and engine as the unequal-variance pairwise option.
- Licklider does not infer whether each row is the correct independent observation unit. If biological replicates, technical replicates, repeated measurements, or nested units are mixed together without an explicit declaration, the software may still run one-way ANOVA even though the design is not valid for it.
- Licklider does not decide whether a declared control group is scientifically appropriate for Dunnett, or whether your comparison family matches your protocol. Those choices must come from the study design, not from the observed result.
See also
- t-Test - exactly two groups
- Two-Way ANOVA - two categorical factors
- Non-Parametric Alternatives - Kruskal-Wallis and other rank-based alternatives
- Group Comparison - parent overview and test-selection guide