Statistical Validity Score

How Licklider evaluates the statistical validity of a figure across multiple dimensions, how that evaluation determines whether a result can be used for claim-bearing export, and where the current support boundary sits.

Use this page when you want to understand the overall claim-readiness state of a figure across several statistical checks.

This page summarizes how multiple checks combine into an overall assessment; it is not the place to resolve any one issue such as missing data, multiplicity, pseudoreplication, or model mis-specification in detail.

Every figure in Licklider is continuously evaluated against a set of statistical validity checks. These checks determine whether the figure can be used for claim-bearing export — whether the result is eligible to appear in a publication, report, or other claim-bearing output.

This evaluation is not a single number. It is the aggregate result of the individual checks that run throughout the analysis: assumption checks, disclosure requirements, design integrity checks, and model diagnostic guards.

That distinction is intentional. A single score can hide which part of an analysis is strong and which part is weak. Licklider keeps the overall assessment tied to named checks so readers can see what is actually resolved, what is only disclosed, and what still blocks a claim-bearing result.

If you already know the specific problem you need to inspect, go directly to the relevant detailed page instead of treating this page as the canonical guide for that issue.


Quick routing

If your main question is about one of the following, go directly to the linked page:

  • Assumptions such as normality, equal variance, or robustness → Normality and Homoscedasticity or Assumption and Robustness Guard. These pages cover assumption failures and robustness checks directly.
  • Pairing, pseudoreplication, or batch structure → Paired vs Unpaired Guard, Pseudoreplication Detection, or Batch and Plate Confounding. These pages cover design integrity rather than overall summary state.
  • Missing data, multiple comparisons, or repeated analytic choices → Missing Data and Attrition, Multiplicity and Analysis Families, or Outliers and Researcher Degrees of Freedom. These pages cover the specific validity threat rather than the aggregate assessment.
  • Regression structure, OLS on bounded outcomes, or survival-data handling → Regression Diagnostics Guard, Proportion Data OLS Prevention, or Survival Data Detection Guard. These pages cover model diagnostics and outcome-type fit directly.

What the overall assessment reflects

The overall validity assessment is visible in the Assurance panel of the Inspector. The panel shows which of three states the figure is in:

Claim-bearing export available

All required disclosures have been confirmed and no checks remain unresolved. The figure can be used in a publication-ready context.

Provisional

The analysis has been completed, but one or more checks are unresolved. The figure can be viewed and used for exploration, but cannot be exported as a formal claim.

Export blocked

One or more checks require resolution before the figure can be exported in any claim-bearing context. The Inspector shows which checks are blocking export.

The overall validity assessment is therefore a workflow state, not a guarantee that the result is scientifically correct in every respect. It is the current aggregate of the checks Licklider can run from the available metadata, recorded operations, and declared analysis intent.
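The aggregation described above can be sketched as a small function that maps named check outcomes to the three workflow states. This is an illustrative model only, not Licklider's actual implementation; the outcome categories and check names below are hypothetical.

```python
from enum import Enum

class CheckOutcome(Enum):
    """Hypothetical per-check outcomes (not Licklider's real data model)."""
    RESOLVED = "resolved"                   # passed, or explicitly addressed by the analyst
    UNRESOLVED_DISCLOSABLE = "disclosable"  # unresolved, but exportable with an automatic disclosure
    UNRESOLVED_BLOCKING = "blocking"        # must be resolved before any claim-bearing export

class FigureState(Enum):
    CLAIM_BEARING_EXPORT_AVAILABLE = "claim-bearing export available"
    PROVISIONAL = "provisional"
    EXPORT_BLOCKED = "export blocked"

def overall_state(checks):
    """Aggregate named check outcomes into a single workflow state.

    `checks` maps a check name (e.g. "normality", "multiplicity") to its
    CheckOutcome. Blocking issues dominate; any other unresolved check
    leaves the figure provisional.
    """
    outcomes = set(checks.values())
    if CheckOutcome.UNRESOLVED_BLOCKING in outcomes:
        return FigureState.EXPORT_BLOCKED
    if CheckOutcome.UNRESOLVED_DISCLOSABLE in outcomes:
        return FigureState.PROVISIONAL
    return FigureState.CLAIM_BEARING_EXPORT_AVAILABLE
```

Because the state is derived from named checks rather than a single number, the Inspector can list exactly which entries caused a provisional or blocked state.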


The dimensions of the evaluation

The following categories of checks contribute to the overall assessment. Each links to a dedicated page where the check is described in detail.

Assumption checks

Whether the statistical assumptions of the selected test are met — normality, equal variance, pairing, independence. → see Normality and Homoscedasticity, Assumption and Robustness Guard

Design integrity

Whether the experimental design is correctly represented — paired vs independent, pseudoreplication, batch confounding. → see Paired vs Unpaired Guard, Pseudoreplication Detection, Batch and Plate Confounding

Reporting completeness

Whether effect size, confidence intervals, and sample size have been disclosed. → see Effect Size, CI, and N Reporting

Multiple comparisons

Whether a multiplicity policy has been declared for confirmatory analyses with multiple claim-bearing figures. → see Multiplicity and Analysis Families

Outlier sensitivity

Whether the conclusion changes when outliers are included or excluded. → see Outlier Sensitivity Report

Missing data

Whether the impact of missing data has been acknowledged. → see Missing Data and Attrition

Researcher degrees of freedom

Whether repeated analytical decisions — exclusion cycling, method changes, subgroup selection — have been tracked and disclosed. → see Outliers and Researcher Degrees of Freedom

Model diagnostics

Whether regression model structure, survival data handling, and outcome type are appropriate. → see Regression Diagnostics Guard, Survival Data Detection Guard, Proportion Data OLS Prevention


How checks interact with export

Each check that is unresolved contributes to a list of requirements that must be addressed before claim-bearing export is allowed. Some checks are blocking — they prevent export entirely until resolved. Others result in automatic disclosures that are included in the export without requiring user confirmation.

The behavior depends on the analysis intent:

  • For exploratory analyses, most checks result in automatic disclosure rather than blocking
  • For confirmatory and publication-ready analyses, unresolved checks block export

The analysis intent is set in the Data Contract → see Outcome Type and Analysis Intent.

This split is deliberate. Exploratory work often needs to continue while the analyst is still learning from the data, so disclosure is usually more useful than hard blocking. Once the same figure is being treated as evidence for a publication or formal claim, unresolved risks should not remain invisible or optional.
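The intent-dependent split might be modeled as follows. This is a sketch under stated assumptions: the set of checks that block even exploratory work is invented for illustration, since the text above only says that "most" checks disclose rather than block in exploratory mode.

```python
from enum import Enum

class Intent(Enum):
    """Analysis intent, as declared in the Data Contract."""
    EXPLORATORY = "exploratory"
    CONFIRMATORY = "confirmatory"
    PUBLICATION_READY = "publication-ready"

# Hypothetical: a few checks might block export even in exploratory mode;
# the documentation only states that "most" become automatic disclosures.
ALWAYS_BLOCKING = {"pseudoreplication"}

def export_effect(check_name, resolved, intent):
    """Return how one check affects claim-bearing export for a given intent."""
    if resolved:
        return "none"
    if intent is Intent.EXPLORATORY and check_name not in ALWAYS_BLOCKING:
        return "automatic disclosure"  # included in the export, no confirmation needed
    return "blocks export"             # must be resolved before export
```

The design choice this sketch captures is that the same unresolved check is handled differently depending on the evidential weight the figure is meant to carry.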

Important: Licklider can only evaluate what is exposed through the uploaded data, declared metadata, and recorded workflow history. It does not automatically recover hidden design problems, missing observation-unit information, or analyses performed outside the tracked system. If those are absent from the record, the overall validity assessment can be too permissive.

Design rationale and references

Licklider presents an aggregate result rather than a single opaque score because statistical validity is multi-dimensional. Strong model diagnostics do not cancel unresolved multiplicity; complete disclosure does not rescue pseudoreplication. The overall state therefore summarizes the combined guard outcome while keeping the underlying checks visible to the user [1, 2].

The three-state structure is intentionally simple: claim-bearing export available, provisional, and export blocked. These states are easier for non-specialist readers to act on than a more granular but less interpretable point scale, while still preserving the detail in the linked check pages.

Licklider uses softer disclosure-first behavior for exploratory work and stronger blocking for confirmatory or publication-ready work because unresolved statistical issues carry different consequences depending on the evidential claim being made [1, 3]. Exploration should remain inspectable; claim-bearing output should meet a higher bar.

Methodological foundations

  1. Wasserstein, R. L., & Lazar, N. A. (2016). The ASA statement on p-values: Context, process, and purpose. The American Statistician, 70(2), 129-133. → Supports evaluating statistical results in the broader context of assumptions, design, and analysis process rather than through a single thresholded indicator.

  2. Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11), 1359-1366. → Supports keeping distinct validity threats visible instead of compressing them into a score that can obscure unresolved flexibility.

  3. Wasserstein, R. L., Schirm, A. L., & Lazar, N. A. (2019). Moving to a world beyond "p < 0.05". The American Statistician, 73(sup1), 1-19. → Supports disclosure-oriented interpretation and the idea that claim strength should reflect the full analytic record, not just one number.

Current support boundary

  • The overall validity assessment only aggregates checks that Licklider can actually run from the current figure, metadata, and recorded workflow state.
  • Licklider does not automatically infer missing observation-unit information, hidden clustering, unlogged exploratory analyses, or off-platform decisions that never enter the recorded analysis history.
  • A figure marked Claim-bearing export available means the currently exposed checks are resolved; it does not mean the figure is immune to undisclosed design or data-quality problems outside the available record.
  • A Provisional or Export blocked state is informative about unresolved requirements, but it is not a substitute for reading the underlying check pages to understand the scientific issue.
  • This page is a coordination layer for the linked validity checks, not the canonical source for the mechanics of any individual diagnostic.

What this page does not cover

Each check listed above has its own page where the specific logic, thresholds, and resolution steps are described. This page is a summary of how the checks work together — not a reference for any individual check.