Effect Size, CI, and N Reporting

What effect sizes, confidence intervals, and sample sizes Licklider reports, and when their disclosure is required for claim-bearing export.

Statistical significance tells you whether an observed difference is unlikely to have arisen by chance. Effect size, confidence intervals, and sample size tell you whether the difference matters, how precisely it is estimated, and how much data it rests on. All three are expected in publications and are required by Licklider for claim-bearing export when inferential results are present.


Effect size

Effect size quantifies the magnitude of a difference or relationship, independent of sample size. Licklider reports effect size alongside statistical test results when it can be computed from the available metadata.

The effect size reported depends on the analysis:

Analysis                         | Effect size reported
t-test (Welch, Student, paired)  | Hedges' g
Mann-Whitney U / Wilcoxon        | Rank-biserial r
Linear regression                | R² (model), r (per predictor)
Logistic regression              | AUC (model)

Effect sizes are included in the Statistical Results Table and in the figure Inspector.
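To illustrate the effect size reported for t-tests: Hedges' g is Cohen's d with a small-sample bias correction. The sketch below uses the common approximation for the correction factor and is illustrative only, not Licklider's implementation:

```python
import math

def hedges_g(mean1, sd1, n1, mean2, sd2, n2):
    """Hedges' g: standardized mean difference with small-sample correction."""
    # Pooled standard deviation across the two groups
    sp = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))
    d = (mean1 - mean2) / sp
    # Approximate bias-correction factor for small samples
    j = 1 - 3 / (4 * (n1 + n2) - 9)
    return d * j

print(round(hedges_g(10.0, 2.0, 12, 8.0, 2.0, 12), 3))  # 0.966
```

Note that g is independent of sample size in the sense described above: doubling both group sizes with the same means and SDs leaves the standardized difference essentially unchanged, while a p-value would shrink.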


Confidence intervals

Confidence intervals show the range of values within which the true parameter is estimated to fall at a given confidence level. Licklider uses 95% confidence intervals throughout.

For group comparisons, the CI is shown for the mean difference. For regression, the CI is shown for each coefficient. For logistic regression, the CI is shown for each odds ratio.

When CI data is present in the figure metadata, Licklider confirms that it has been disclosed before allowing claim-bearing export.
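As an illustration, a 95% CI for a mean difference can be sketched as below. This uses a large-sample normal approximation (z ≈ 1.96) rather than the t-distribution, so it is a simplified sketch, not Licklider's implementation:

```python
from statistics import NormalDist

def mean_diff_ci95(mean1, sd1, n1, mean2, sd2, n2):
    """Large-sample 95% CI for the difference in means (normal approximation)."""
    diff = mean1 - mean2
    # Standard error of the difference, unpooled (Welch-style)
    se = (sd1**2 / n1 + sd2**2 / n2) ** 0.5
    z = NormalDist().inv_cdf(0.975)  # ≈ 1.96
    return diff - z * se, diff + z * se

lo, hi = mean_diff_ci95(10.0, 2.0, 50, 8.0, 2.0, 50)
print(round(lo, 2), round(hi, 2))  # 1.22 2.78
```

The width of the interval is what communicates precision: the same mean difference of 2.0 with half the data gives a noticeably wider interval.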


Sample size (N)

Sample size is the number of observations used in the analysis — the analysis N, after any exclusions or missing data handling. This is distinct from the input N (the number of rows in the original dataset) and the biological N (the number of independent biological units).

Licklider reports analysis N in the figure Inspector and includes it in the Statistical Results Table.

For sample attrition — how the input N became the analysis N — see N Disclosure and Attrition Trail.

Licklider can report analysis N and, where supported by the data contract, biological N. However, it cannot guarantee that the biologically independent unit has been declared correctly. If subject IDs, replicate structure, or nesting are wrong or incomplete, the reported N values may still look precise while reflecting the wrong unit of analysis.
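The three N values can be made concrete with a small sketch (hypothetical data and field names, not Licklider's data model):

```python
rows = [
    {"subject": "s1", "value": 4.2},
    {"subject": "s1", "value": None},  # missing value, excluded from analysis
    {"subject": "s2", "value": 3.9},
    {"subject": "s2", "value": 4.1},
    {"subject": "s3", "value": None},  # subject lost entirely to exclusions
]

input_n = len(rows)  # rows in the original dataset
analysed = [r for r in rows if r["value"] is not None]
analysis_n = len(analysed)  # observations used in the analysis
# Biological N: distinct independent units surviving exclusions
biological_n = len({r["subject"] for r in analysed})

print(input_n, analysis_n, biological_n)  # 5 3 2
```

Note that the correctness of biological N here depends entirely on the `subject` field being the right unit of independence, which is exactly the declaration Licklider cannot verify on its own.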


When disclosure is required

When a figure contains inferential results (p-values, test statistics, or model coefficients), Licklider evaluates whether effect size, CI, and sample size information is present and has been disclosed. If any of these is missing or unacknowledged, claim-bearing export is blocked for confirmatory and publication-ready analyses.

The Inspector will indicate which elements are unresolved.
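A simplified sketch of this kind of gate is shown below. The field names are hypothetical and this is not Licklider's actual check; it only illustrates the rule that inferential results without disclosed effect size, CI, and N block export:

```python
def export_blocked(figure):
    """Return (blocked, missing): block claim-bearing export when
    inferential results are present but required disclosures are not."""
    inferential_keys = ("p_value", "test_statistic", "coefficients")
    if not any(k in figure for k in inferential_keys):
        return False, []  # descriptive figure: no disclosure gate applies
    required = ("effect_size", "ci", "analysis_n")
    missing = [k for k in required if figure.get(k) is None]
    return bool(missing), missing

print(export_blocked({"p_value": 0.01}))  # (True, ['effect_size', 'ci', 'analysis_n'])
```

As described above, this confirms presence and disclosure only; it cannot judge whether the disclosed values are scientifically appropriate.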

This check confirms presence and disclosure, not scientific sufficiency by itself. Licklider can verify that an effect size, confidence interval, or N value exists in the reporting surface, but it cannot decide on its own whether the chosen metric is the best one for your field, whether the interval is interpreted correctly, or whether the underlying unit of analysis is scientifically valid.

Design rationale and references

Licklider requires effect size, confidence intervals, and N together for claim-bearing export because p-values alone do not tell a reader whether an observed result is large enough to matter, precise enough to trust, or based on enough independent data to support the stated claim.

The separation between analysis N and biological N is also deliberate. In many life-science datasets, the number of rows is not the same as the number of independent biological units. Keeping these quantities visible reduces the risk that technical replicates, repeated measures, or nested observations are silently reported as if they were independent samples.

This page therefore treats disclosure as a minimum reporting gate, not as a guarantee that interpretation is complete. A figure that reports effect size, CI, and N is better specified than one that omits them, but readers still need to judge whether the metric, interval, and unit of analysis fit the scientific question.

References

  1. Sullivan, G. M., & Feinn, R. (2012). Using Effect Size - or Why the P Value Is Not Enough. Journal of Graduate Medical Education, 4(3), 279-282. https://doi.org/10.4300/JGME-D-12-00156.1
  2. Wasserstein, R. L., & Lazar, N. A. (2016). The ASA's Statement on p-Values: Context, Process, and Purpose. The American Statistician, 70(2), 129-133. https://doi.org/10.1080/00031305.2016.1154108

What this page does not cover