Fisher's Exact Test

When to use Fisher's exact test instead of chi-square, how it works, how to interpret the odds ratio, and where the current support boundary sits.

Fisher's exact test evaluates whether two categorical variables are associated, using the exact probability distribution rather than a chi-square approximation. It is most appropriate when expected cell counts are small, which is when the chi-square approximation becomes unreliable. Licklider supports both the classic 2×2 form and the Freeman-Halton extension for general R×C tables.

Note: Fisher's exact test runs automatically alongside chi-square whenever Licklider detects a contingency table with at least two row levels and two column levels. The 2×2 form returns an odds ratio and confidence interval; the R×C form (3×2, 2×3, 3×3, ...) returns a single two-sided p-value computed under the Freeman-Halton conditional distribution. Both shapes appear in the Inspector alongside chi-square.

When to use Fisher's exact test

Fisher's exact test is appropriate when:

  • The contingency table is 2×2 or a larger R×C table
  • One or more cells have an expected count below 5
  • The sample size is small

For 2×2 tables, Licklider runs the classical Fisher's exact test and reports an odds ratio with a 95% confidence interval. For larger tables it switches to the Freeman-Halton extension. The Freeman-Halton test conditions on the observed row and column margins in the same way as the 2×2 form, but does not yield a single odds ratio because the association can have more than one direction.

For large samples with adequate expected counts, the chi-square test and Fisher's exact test give very similar results. Fisher's exact test remains valid regardless of sample size, but chi-square is the standard choice when expected counts are adequate.

The usual rule of thumb is that Fisher becomes especially valuable when one or more expected counts fall below 5. That threshold is a practical warning sign rather than a law of nature, but it is widely used because the chi-square approximation becomes less reliable as tables become sparse.
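The rule of thumb can be checked directly from the observed table. The following is a minimal sketch, assuming SciPy is available; the helper name `sparse_cells` and the example counts are illustrative and not part of Licklider.

```python
import numpy as np
from scipy.stats.contingency import expected_freq

def sparse_cells(table, threshold=5.0):
    """Return expected counts under independence and a mask of cells below the threshold."""
    observed = np.asarray(table, dtype=float)
    expected = expected_freq(observed)  # (row total * column total) / N for each cell
    return expected, expected < threshold

# Illustrative 2x2 table (made-up counts).
observed = [[3, 9], [7, 2]]
expected, mask = sparse_cells(observed)
print(expected)
print("any expected count below 5:", bool(mask.any()))
```

Here the second row has expected counts below 5, so this table is one where the exact result would usually be preferred.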

How it runs in Licklider

When a contingency table is analyzed, Licklider runs both the chi-square test and Fisher's exact test automatically. Both results appear in the Inspector so you can compare them directly.

Fisher's exact test is not triggered by specifying it directly. It runs whenever the input data form a contingency table with at least two row levels and two column levels and chi-square is the active analysis path. The 2×2 case is dispatched to the classical Fisher endpoint, and tables larger than 2×2 are dispatched to the Freeman-Halton R×C endpoint.

For R×C tables Licklider uses a hybrid algorithm:

  • Exact enumeration of the Freeman-Halton conditional distribution when the table is small (total N at most 200 and at most nine cells).
  • Monte Carlo simulation of the multivariate hypergeometric distribution otherwise. The default is 1,000,000 replicates, and the seed is recorded in the audit payload so the run can be reproduced.
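The switching rule in the list above can be written as a small dispatch function. This is only a sketch of the stated thresholds; the function name, constant names, and return labels are hypothetical and do not reflect Licklider's internal code.

```python
import numpy as np

MAX_EXACT_N = 200      # total-N threshold quoted in the text
MAX_EXACT_CELLS = 9    # cell-count threshold quoted in the text

def choose_rxc_method(table):
    """Pick a computation method for an R x C table from its size alone (hypothetical helper)."""
    counts = np.asarray(table)
    if counts.sum() <= MAX_EXACT_N and counts.size <= MAX_EXACT_CELLS:
        return "exact_enumeration"
    return "monte_carlo"

print(choose_rxc_method([[4, 1, 0], [2, 3, 5]]))         # small 2x3 table
print(choose_rxc_method([[90, 80], [70, 60], [50, 40]])) # N = 390
```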

If the chi-square result shows a warning about small expected counts, the Fisher result provides a more reliable p-value for the same hypothesis. Showing both results is intentional: when the approximate and exact results start to diverge, that is a sign that sparse counts are affecting the approximation and the exact result should carry more weight.
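To see the two results side by side outside the Inspector, a comparison along these lines can be run with SciPy. It is a sketch using made-up counts; whether Licklider applies a continuity correction to chi-square is not stated here, so `correction=False` is an assumption.

```python
from scipy.stats import chi2_contingency, fisher_exact

# Illustrative sparse 2x2 table (made-up counts).
table = [[3, 9], [7, 2]]

# Pearson chi-square without continuity correction (assumption, see lead-in).
chi2, p_chi2, dof, expected = chi2_contingency(table, correction=False)

# Classical two-sided Fisher's exact test on the same table.
odds_ratio, p_fisher = fisher_exact(table, alternative="two-sided")

print(f"chi-square p = {p_chi2:.4f} (dof = {dof})")
print(f"Fisher p     = {p_fisher:.4f}, odds ratio = {odds_ratio:.4f}")
print(f"smallest expected count = {expected.min():.2f}")
```

A visible gap between the two p-values, together with a small minimum expected count, is the pattern the paragraph above describes.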

Important: Licklider does not automatically detect whether the rows are truly independent, whether a 2×2 table is actually a paired design, or whether a one-sided direction was chosen only after looking at the data. If the same subject appears more than once, or if the table comes from matched before/after outcomes, Fisher's exact test for independent samples is not the right safeguard.

Tables larger than 2×2 (Freeman-Halton)

For contingency tables with more than two rows or columns (e.g., 3×2, 2×3, 3×3), Licklider uses the Freeman-Halton extension of Fisher's exact test.

This test computes the exact probability of obtaining a table as extreme as or more extreme than the observed table, given the fixed row and column margins. It is the natural generalization of Fisher's exact test to R×C tables.

For small tables (N at most 200 and at most nine cells), the p-value is computed exactly using enumeration. For larger tables, Licklider uses Monte Carlo simulation with up to 1,000,000 replicates and a fixed random seed for reproducibility. The seed source (client-supplied, derived from the request id, or default) is recorded in the audit payload.
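A seeded Monte Carlo estimate of the same p-value can be sketched as follows. Tables with fixed margins are drawn by shuffling column labels against fixed row labels, which samples from the conditional (multivariate hypergeometric) distribution. The function name, the default of 10,000 replicates (chosen so the example runs quickly), the seed value, and the add-one correction are assumptions of this sketch, not documented Licklider behavior.

```python
import numpy as np
from math import lgamma

def freeman_halton_monte_carlo(table, n_replicates=10_000, seed=12345):
    """Estimate the two-sided Freeman-Halton p-value by simulation with fixed margins."""
    observed = np.asarray(table, dtype=np.int64)
    n_rows, n_cols = observed.shape
    # One row label and one column label per observation.
    row_labels = np.repeat(np.arange(n_rows), observed.sum(axis=1))
    col_labels = np.repeat(np.arange(n_cols), observed.sum(axis=0))
    rng = np.random.default_rng(seed)  # fixed seed makes the run reproducible

    def neg_log_weight(t):
        # With margins fixed, P(table) is proportional to 1 / prod(n_ij!),
        # so a larger sum of log-factorials means a less probable table.
        return sum(lgamma(x + 1) for x in t.ravel())

    obs_stat = neg_log_weight(observed)
    hits = 0
    for _ in range(n_replicates):
        shuffled = rng.permutation(col_labels)
        sim = np.zeros((n_rows, n_cols), dtype=np.int64)
        np.add.at(sim, (row_labels, shuffled), 1)  # tabulate the shuffled pairs
        if neg_log_weight(sim) >= obs_stat - 1e-7:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_replicates + 1)

p1 = freeman_halton_monte_carlo([[3, 1, 0], [1, 2, 4]])
p2 = freeman_halton_monte_carlo([[3, 1, 0], [1, 2, 4]])
print(p1, p1 == p2)
```

Calling the function twice with the same seed returns the same estimate, which is the property that recording the seed in the audit payload is meant to preserve.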

Differences from 2×2

Feature                  2×2                        R×C
p-value                  Exact                      Exact or Monte Carlo
Odds ratio               Yes                        Not defined
Confidence interval      Yes (Cornfield)            Not available
One-sided test           Yes                        Not available (two-sided only)
Alternative hypothesis   two-sided, less, greater   two-sided only

When to use R×C Fisher

The Freeman-Halton test is most useful when some expected cell counts in your table are small (below 5), making the chi-square approximation unreliable. For tables where all expected counts are at least 5, the chi-square test is generally adequate.

Licklider runs both the chi-square test and Fisher's exact test (or Freeman-Halton for R×C) whenever a contingency table is analyzed. Both results appear in the Inspector panel.

What the Inspector shows

For 2×2 tables, the Inspector includes:

  • The odds ratio
  • A p-value
  • A 95% confidence interval for the odds ratio
  • The observed and expected counts
  • A transparency block showing the requested alternative, executed alternative, and engine identifier
  • A note if McNemar's test would be more appropriate for a paired 2×2 design

For R×C tables, the Inspector instead shows:

  • A two-sided Freeman-Halton p-value
  • The observed table shape, total N, and counts
  • The computation method (exact enumeration or Monte Carlo) and the algorithm note
  • For Monte Carlo runs, the number of replicates and the seed used, together with the seed source (client-supplied, derived from the request id, or default)
  • The engine identifier and rules version

Odds ratios and confidence intervals are intentionally omitted for R×C results because a single odds ratio is not well defined when either variable has more than two levels.

Odds ratio and table orientation

The odds ratio measures the strength of association between the two variables. An odds ratio of 1 indicates no association. An odds ratio greater than 1 indicates that the odds of the outcome are higher in one group than in the other; an odds ratio less than 1 indicates that they are lower.

The direction of the odds ratio depends on how the 2×2 table is oriented. Swapping the row order or the column order inverts the odds ratio, so you should read it together with the displayed table rather than as a context-free number.
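The effect of orientation is easy to verify with the sample (cross-product) odds ratio, ad/bc. The snippet below is a small illustration with made-up counts; the helper name is hypothetical, and the estimator Licklider reports may differ in detail.

```python
def sample_odds_ratio(table):
    """Cross-product odds ratio (a*d)/(b*c) for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d) / (b * c)

table = [[12, 5], [4, 9]]
rows_swapped = [table[1], table[0]]
cols_swapped = [row[::-1] for row in table]
both_swapped = [row[::-1] for row in rows_swapped]

print(sample_odds_ratio(table))         # original orientation
print(sample_odds_ratio(rows_swapped))  # reciprocal of the original
print(sample_odds_ratio(cols_swapped))  # reciprocal of the original
print(sample_odds_ratio(both_swapped))  # back to the original value
```

Swapping rows or columns alone gives the reciprocal, while swapping both restores the original value.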

Odds ratios are reported because the p-value tells you whether the table is surprising under the null, but not how large the association is. The odds ratio gives that magnitude and direction.

If one cell is zero or very close to zero, the odds ratio can become extremely large, extremely small, or unstable to tiny data changes. In that setting, focus on the raw counts and practical context as well as the reported statistic.

Zero-cell tables

When one or more cells in the 2×2 table contain zero, the odds ratio may be 0.0 or may become difficult to interpret directly, and the confidence-interval lower bound can be 0.0. This is a correct result, not an error.

Zero-cell tables are often sparse enough that small changes in the data can substantially change the odds ratio. In those cases, read the result conservatively and keep the raw table visible in your interpretation.
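The sensitivity described above can be illustrated with the cross-product odds ratio. This sketch uses made-up counts and a hypothetical helper; how Licklider reports the case where the denominator is zero is not specified on this page, so returning infinity there is an assumption of the example.

```python
import math

def sample_odds_ratio(table):
    """Cross-product odds ratio for [[a, b], [c, d]], tolerating zero cells."""
    (a, b), (c, d) = table
    numerator, denominator = a * d, b * c
    if denominator == 0:
        # Assumption for this sketch: undefined when both products are zero, infinite otherwise.
        return math.nan if numerator == 0 else math.inf
    return numerator / denominator

zero_cell = [[0, 5], [4, 3]]
one_moved = [[1, 4], [4, 3]]   # one observation moved within the first row

print(sample_odds_ratio(zero_cell))   # a zero in cell a forces the ratio to 0.0
print(sample_odds_ratio(one_moved))   # a single observation changes it noticeably
print(sample_odds_ratio([[5, 0], [3, 4]]))  # a zero in cell b makes the ratio unbounded
```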

For guidance on interpreting zero-cell results, see Known Limitations.

One-sided testing

By default, Fisher's exact test is two-sided: it tests whether the association goes in either direction. For 2×2 tables, a one-sided test can be requested when the direction of the association was specified in advance. The Freeman-Halton form for R×C tables is two-sided only, because a one-sided alternative is not well defined once either variable has more than two levels.

If you have declared a hypothesis direction before running the analysis, Licklider resolves the sidedness automatically:

  • A declared directional increase uses alternative="greater"
  • A declared directional decrease uses alternative="less"
  • An undeclared or two-sided hypothesis uses alternative="two-sided"
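The mapping in the list above corresponds to the `alternative` argument of `scipy.stats.fisher_exact`. The following sketch shows one way to express it; the direction keys ("increase", "decrease", None) and the helper name are hypothetical, and the counts are made up.

```python
from scipy.stats import fisher_exact

# Hypothetical direction labels mapped to the alternatives named in the text.
ALTERNATIVE_BY_DIRECTION = {
    "increase": "greater",
    "decrease": "less",
    None: "two-sided",
}

def resolve_alternative(declared_direction=None):
    """Translate a declared hypothesis direction into a Fisher alternative."""
    return ALTERNATIVE_BY_DIRECTION[declared_direction]

table = [[8, 2], [3, 7]]
p_values = {}
for direction in ("increase", "decrease", None):
    alternative = resolve_alternative(direction)
    _, p_values[alternative] = fisher_exact(table, alternative=alternative)
    print(f"{alternative:>9}: p = {p_values[alternative]:.4f}")
```

In SciPy, "greater" tests for an odds ratio above 1 in the table's given orientation, so the direction only makes sense once the row and column order is fixed.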

One-sided testing should only be used when the direction was pre-specified. For more detail, see Hypothesis Direction and Sidedness.

Licklider cannot determine from the dataset whether that directional hypothesis was truly specified in advance. That decision has to come from your protocol, not from the observed table.

In the API surface, one-sided requests should carry an explicit preregistration declaration. If that declaration is absent, the Fisher route rejects the request instead of silently running a one-sided test.

The Inspector also carries a small transparency payload in the Fisher result. That payload records the requested alternative, the executed alternative, and the engine identifier scipy.stats.fisher_exact so it is clear whether the result came from the default figure flow or from a direct API request.
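The preregistration gate and the transparency payload described above could look roughly like this. Everything in the sketch other than the engine identifier string is hypothetical: the function name, the `preregistered` flag, the payload keys, and the error type are illustrative and are not Licklider's actual API.

```python
ENGINE_ID = "scipy.stats.fisher_exact"  # engine identifier quoted in the text
ONE_SIDED = {"less", "greater"}

def prepare_fisher_request(alternative="two-sided", preregistered=False):
    """Validate sidedness and build a transparency record (hypothetical shape)."""
    if alternative not in ONE_SIDED | {"two-sided"}:
        raise ValueError(f"unknown alternative: {alternative!r}")
    if alternative in ONE_SIDED and not preregistered:
        # Reject rather than silently running a one-sided test.
        raise ValueError("one-sided request needs an explicit preregistration declaration")
    return {
        "requested_alternative": alternative,
        "executed_alternative": alternative,
        "engine": ENGINE_ID,
    }

print(prepare_fisher_request())
print(prepare_fisher_request("greater", preregistered=True))
try:
    prepare_fisher_request("less")
except ValueError as err:
    print("rejected:", err)
```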

Design rationale and references

Licklider shows Fisher's exact test alongside chi-square for 2×2 tables so readers can compare the exact and approximate answers on the same data. This is especially useful when expected counts are small, because the chi-square result may drift away from the exact calculation in sparse tables [1, 2].

For larger tables Licklider runs the Freeman-Halton extension of Fisher's exact test [4]. The Freeman-Halton form keeps the conditional exact logic while extending it to general R×C tables. The hybrid algorithm switches between exact enumeration for small tables and Monte Carlo simulation for larger ones so that the calculation remains tractable even when the row and column margins make exact enumeration prohibitively expensive [3, 4].

The odds ratio is emphasized because hypothesis tests answer whether an association is compatible with chance under the null, while the odds ratio describes the direction and magnitude of the association itself [3].

Methodological foundations

  1. Fisher, R. A. (1922). On the interpretation of chi-square from contingency tables, and the calculation of P. Journal of the Royal Statistical Society, 85(1), 87-94. -> Original derivation of the exact conditional approach used for 2×2 contingency tables.

  2. Campbell, I. (2007). Chi-squared and Fisher-Irwin tests of two-by-two tables with small sample recommendations. Statistics in Medicine, 26(19), 3661-3675. -> Explains why exact methods are preferred when 2×2 tables are sparse and why approximation-based tests become less trustworthy in that setting.

  3. Agresti, A. (2002). Categorical Data Analysis (2nd ed.). Wiley. -> Standard reference for interpreting odds ratios, exact tests, and the role of table orientation in categorical association analysis.

  4. Freeman, G. H., & Halton, J. H. (1951). Note on an exact treatment of contingency, goodness of fit and other problems of significance. Biometrika, 38(1/2), 141-149. -> Original derivation of the exact conditional p-value for general R×C contingency tables that Licklider implements as the Freeman-Halton extension.

Current limitations

  • R×C tables use two-sided tests only; one-sided alternatives are not defined for tables larger than 2×2.
  • Odds ratio and confidence intervals are not available for tables larger than 2×2.
  • For large tables, the Monte Carlo estimate has inherent variability. With up to 1,000,000 replicates this is negligible for practical purposes, but results are not bit-for-bit identical when the seed source changes.
  • Boschloo's test and mid-p Fisher are not yet available.
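To put the Monte Carlo variability mentioned above in perspective, the standard error of a simulated p-value is approximately sqrt(p(1 - p) / B) for B replicates, treating each replicate as an independent draw. The helper below is a sketch with a hypothetical name; for p = 0.05 and B = 1,000,000 it gives a standard error of roughly 0.0002.

```python
from math import sqrt

def monte_carlo_standard_error(p, n_replicates):
    """Binomial standard error of a p-value estimated from independent replicates."""
    return sqrt(p * (1.0 - p) / n_replicates)

se = monte_carlo_standard_error(0.05, 1_000_000)
print(f"{se:.6f}")
```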

Current support boundary

  • Fisher's exact test is supported for 2×2 tables (classical form) and for R×C tables larger than 2×2 (Freeman-Halton extension).
  • Fisher's exact test runs automatically as part of the chi-square workflow; selecting it as a standalone method, without chi-square, is not currently supported.
  • Licklider does not automatically detect pseudoreplication, repeated measurements, hidden clustering, or matched-pair structure that would make McNemar or another design-specific method more appropriate.
  • Licklider does not automatically determine whether a one-sided alternative was genuinely pre-specified before the data were seen.
  • Licklider does not automatically decide which row and column ordering is the scientifically preferred one for interpreting the odds ratio.

What this page does not cover