Pseudoreplication Detection

How Licklider detects pseudoreplication in group comparisons, what it asks when repeated observations are present, and how the answer affects the analysis.

Pseudoreplication occurs when observations that appear to be independent actually share a common biological source, and the analysis treats them as if they were independent. It is one of the most common sources of inflated false positive rates in life sciences data.

Common examples:

  • Multiple cells measured from the same animal, with each cell treated as an independent observation
  • Multiple wells measured from the same culture, with each well treated as independent
  • Multiple technical replicates from the same sample, analyzed as if they came from separate biological units

In each case, the effective sample size is the number of independent biological units, not the number of rows.
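The distinction between rows and independent units can be made concrete with a short sketch. The column layout and IDs below are hypothetical, not Licklider's internal representation:

```python
# Hypothetical rows: (subject_id, group, measurement).
# Ten rows, but only four distinct animals contribute them.
rows = [
    ("mouse_1", "treatment", 4.1), ("mouse_1", "treatment", 3.9),
    ("mouse_1", "treatment", 4.3), ("mouse_2", "treatment", 5.0),
    ("mouse_2", "treatment", 5.2), ("mouse_3", "control", 2.8),
    ("mouse_3", "control", 3.0), ("mouse_3", "control", 2.9),
    ("mouse_4", "control", 3.5), ("mouse_4", "control", 3.4),
]

n_rows = len(rows)                                   # 10 measurements
n_units = len({subject for subject, _, _ in rows})   # 4 independent animals

print(f"rows: {n_rows}, effective sample size: {n_units}")
```

An analysis that treats this dataset as n = 10 rather than n = 4 is pseudoreplicated.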


How Licklider detects it

Licklider looks for two signals when a group comparison is requested:

Cross-group reuse The same subject ID appears in more than one group. This is the strongest signal: if the same biological unit contributed measurements to both the treatment and the control group, those observations are not independent in the usual sense.

Repeated rows within a group The same subject ID appears multiple times within a single group. This pattern suggests technical replication — multiple measurements from the same unit — which are not independent of each other.

The more of these signals that are present, and the more consistently they appear, the higher Licklider's confidence that pseudoreplication is occurring.

Cross-group reuse is treated as the strongest signal because it directly shows that the same biological unit contributes to the contrast between groups. Repeated rows within a single group are still a serious warning, but they are more likely to reflect either technical replication or a dataset that needs clarification before the analysis can be interpreted.
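Both signals can be checked with a single scan over (subject, group) pairs. The sketch below is illustrative only, not Licklider's implementation:

```python
from collections import defaultdict

def detect_pseudoreplication(rows):
    """Flag the two repeated-ID signals.

    rows: iterable of (subject_id, group) pairs. Returns the set of
    subject IDs seen in more than one group (cross-group reuse) and
    the set seen more than once within a single group (repeated rows).
    """
    groups_per_subject = defaultdict(set)
    counts_within_group = defaultdict(int)
    for subject, group in rows:
        groups_per_subject[subject].add(group)
        counts_within_group[(subject, group)] += 1

    cross_group = {s for s, gs in groups_per_subject.items() if len(gs) > 1}
    within_group = {s for (s, _), n in counts_within_group.items() if n > 1}
    return cross_group, within_group

rows = [("m1", "treatment"), ("m1", "control"),   # m1 reused across groups
        ("m2", "control"), ("m2", "control"),     # m2 repeated within a group
        ("m3", "treatment")]
cross, within = detect_pseudoreplication(rows)
print(cross)   # {'m1'}
print(within)  # {'m2'}
```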

Licklider cannot detect pseudoreplication if the subject ID is missing, declared incorrectly, reused inconsistently across files, or recorded at the wrong level of the hierarchy. If cells, wells, or technical replicates are exported as if they were already independent rows with no shared parent ID, the check may not fire even though the effective sample size is too small for the analysis being run.


What you are asked to confirm

When repeated observations are detected, Licklider presents a confirmation step with three options:

These are repeated measurements from the same subject The repeated rows are intentional — the same biological unit was measured more than once, and the analysis should account for this structure. Selecting this option records the repeated-measures design and requires an appropriate disclosure.

These are independent samples The repeated rows come from genuinely independent biological units. Select this when subject IDs appear multiple times because each row represents a distinct biological replicate, not a technical repeat.

Note: if the subject ID column in the Data Contract is set and the same ID appears multiple times, selecting independent samples will flag a conflict. The conflict should be resolved before the figure is finalized.

I am not sure yet — show a descriptive figure only The replication structure is unclear. The figure will be generated as a descriptive result and cannot be used in a claim-bearing context until the structure is resolved.
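The conflict mentioned in the note on the independent-samples option amounts to a mechanical check: a declared subject ID column plus any duplicated ID contradicts an independence claim. A minimal sketch, with hypothetical names:

```python
def independence_conflict(subject_ids, declared_subject_column):
    """Return True when 'independent samples' conflicts with the data.

    If a subject ID column is declared in the Data Contract and any ID
    appears more than once, selecting 'independent samples' is a
    contradiction that must be resolved before the figure is finalized.
    Illustrative helper only; names are assumptions.
    """
    if declared_subject_column is None:
        return False  # no declared ID column, nothing to check against
    return len(subject_ids) != len(set(subject_ids))

print(independence_conflict(["m1", "m1", "m2"], "subject_id"))  # True
print(independence_conflict(["m1", "m2", "m3"], "subject_id"))  # False
```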


Effect on the analysis

The confirmation determines whether the analysis proceeds as a standard group comparison or as a repeated-measures analysis:

  • Independent samples confirmed — the analysis proceeds as a standard comparison. Pseudoreplication blocking is lifted.
  • Repeated measurements confirmed — the analysis is flagged as repeated-measures. A repeated-measures disclosure is added to the figure. If no repeated-measures model was used, the Inspector will suggest one.
  • Unresolved — for exploratory analyses, the figure remains provisional. For confirmatory and publication-ready analyses, claim-bearing export is blocked until the structure is confirmed.
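The three outcomes above form a simple decision table. The sketch below restates it in code; the option and mode names are assumptions, not the product's actual identifiers:

```python
def analysis_path(confirmation, analysis_mode):
    """Map the confirmation choice to an analysis outcome.

    confirmation: 'independent', 'repeated', or 'unresolved'.
    analysis_mode: 'exploratory', 'confirmatory', or 'publication'.
    """
    if confirmation == "independent":
        # Standard comparison; pseudoreplication blocking is lifted.
        return {"analysis": "standard", "export_blocked": False}
    if confirmation == "repeated":
        # Repeated-measures analysis with a disclosure on the figure.
        return {"analysis": "repeated-measures", "export_blocked": False,
                "disclosure": "repeated-measures"}
    # Unresolved: exploratory figures stay provisional; claim-bearing
    # export is blocked for confirmatory and publication-ready analyses.
    blocked = analysis_mode in ("confirmatory", "publication")
    return {"analysis": "descriptive", "export_blocked": blocked}

print(analysis_path("unresolved", "confirmatory"))
```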

Confidence levels

The pseudoreplication check is assigned a confidence level based on how strong the signals are:

  • High confidence — the same ID appears in more than one group multiple times
  • Medium confidence — the same ID appears across groups once
  • Low confidence — the same ID appears multiple times within a group but not across groups

At lower confidence levels, Licklider surfaces a warning rather than requiring confirmation.

This distinction is deliberate. High-confidence patterns are more likely to represent a genuine independence violation that can change the validity of the inferential result, so they justify an explicit confirmation step. Lower-confidence patterns are still important, but forcing a block on weak heuristics would create avoidable friction for datasets where the repeated-ID pattern has a benign explanation.
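One illustrative reading of the tiers is sketched below, assuming the two signal counts from the detection step. This is not the shipped logic, only a way to make the tiering concrete:

```python
def confidence_level(cross_group_reuses, within_group_repeats):
    """Assign a confidence tier from the two signal counts.

    cross_group_reuses: number of IDs that appear in more than one group.
    within_group_repeats: number of IDs repeated within a single group.
    """
    if cross_group_reuses > 1:
        return "high"      # IDs span groups repeatedly: strongest evidence
    if cross_group_reuses == 1:
        return "medium"    # a single cross-group reuse
    if within_group_repeats > 0:
        return "low"       # repeats only within a group: warning, not block
    return "none"

print(confidence_level(2, 0))  # high
print(confidence_level(0, 3))  # low
```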


Where the result appears

The pseudoreplication check result and your confirmation are visible in the Inspector. If the check is unresolved, the Inspector will indicate that confirmation is required before claim-bearing export is allowed.

Design rationale and references

Licklider treats pseudoreplication as a first-order quality check because counting non-independent rows as if they were separate biological units can make sample size look larger than it really is and drive p-values lower than the design supports.

That is why the check is connected to claim-bearing export rather than shown as a passive note only. If the replication structure is unresolved, the risk is not cosmetic: it affects whether the inferential result is scientifically interpretable.

The confirmation step exists because repeated IDs are informative but not self-explanatory. A repeated pattern can reflect legitimate repeated measurement, true independent units with a naming collision, or a misdeclared subject column. The software can surface the pattern, but the scientific interpretation still depends on the study design and the declared unit of analysis.

References

  1. Hurlbert, S. H. (1984). Pseudoreplication and the design of ecological field experiments. Ecological Monographs, 54(2), 187-211. https://doi.org/10.2307/1942661
  2. Lazic, S. E. (2010). The problem of pseudoreplication in neuroscientific studies: is it affecting your analysis? BMC Neuroscience, 11, 5. https://doi.org/10.1186/1471-2202-11-5

What this page does not cover