Outlier and Exclusion Policy
How to define and document your outlier handling approach before analysis, and how Licklider records every exclusion decision for disclosure and sensitivity evaluation.
Decisions about how to handle outliers should be made before you look at the results — not after. When exclusion decisions are made post-hoc, the choice of which observations to remove can unconsciously favor a particular outcome. This is one of the most common sources of researcher degrees of freedom in life sciences data.
Licklider requires you to declare the criterion for any exclusion before the action is applied. Every exclusion is then recorded and disclosed, and the sensitivity of your results to that exclusion is evaluated automatically.
Four ways to handle an outlier
Licklider provides four distinct actions for any observation identified as a potential outlier:
Exclude: Remove the observation from the dataset. The row is not used in any subsequent analysis. The exclusion criterion, the number of excluded rows, and the group-level breakdown are recorded in the Outlier Exclusion Log.
Keep: Explicitly retain the observation. This is useful when an observation is flagged by the automatic detection but has a known biological explanation — for example, a genuine extreme responder or an expected outlier based on protocol. Keeping a flagged observation is also recorded.
Winsorize: Replace the extreme value with the boundary value at the detection threshold, rather than removing the row entirely. The winsorized cell count per group is recorded. This approach retains sample size while limiting the influence of extreme values.
Manual override: Override the automatic detection decision for a specific row, regardless of whether it was flagged. You can manually exclude a row that was not flagged by the criterion, or manually keep a row that was flagged. Each manual override requires a written justification that is recorded with the row ID.
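The difference between excluding and winsorizing can be sketched in plain Python. This is an illustrative sketch only: the helper functions below are hypothetical and are not part of Licklider's API.

```python
# Illustrative sketch of exclude vs. winsorize under a Tukey-style IQR rule.
# These helpers are hypothetical, not Licklider's actual implementation.
import statistics

def iqr_bounds(values, k=1.5):
    """Return (lower, upper) Tukey fences at multiplier k."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def exclude(values, k=1.5):
    """Drop observations outside the fences (sample size shrinks)."""
    lo, hi = iqr_bounds(values, k)
    return [v for v in values if lo <= v <= hi]

def winsorize(values, k=1.5):
    """Clamp observations to the fences (sample size preserved)."""
    lo, hi = iqr_bounds(values, k)
    return [min(max(v, lo), hi) for v in values]

data = [4.1, 4.3, 4.2, 4.5, 4.4, 9.8]  # one extreme value
print(exclude(data))    # the extreme row is removed entirely
print(winsorize(data))  # the extreme value is clamped; length unchanged
```

Note that winsorizing leaves the row count, and therefore the reported N, unchanged, while exclusion reduces it.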
Declaring an exclusion criterion
Before any exclusion or winsorization is applied, Licklider requires you to declare the criterion.
IQR-based criterion: The default method. An observation is flagged if its value falls below Q1 - k × IQR or above Q3 + k × IQR, where k is the IQR multiplier you specify. The default multiplier is 1.5 (Tukey's method). You can adjust this value — for example, using 3.0 for a more conservative threshold.
The multiplier is recorded alongside every exclusion that uses it, so the exact criterion applied is always traceable.
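As a concrete illustration of how the multiplier changes which rows are flagged, consider the following sketch. The helper is hypothetical and not part of Licklider's API:

```python
# Illustrative: how the IQR multiplier k changes which rows are flagged.
# Hypothetical helper, not Licklider's actual implementation.
import statistics

def flagged_rows(values, k):
    """Return indices of observations outside Q1 - k*IQR .. Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < lo or v > hi]

data = [10, 11, 12, 11, 10, 12, 11, 10, 22, 40]
print(flagged_rows(data, k=1.5))  # Tukey's default flags both high values
print(flagged_rows(data, k=3.0))  # a more conservative threshold flags fewer
```

Because the multiplier is recorded with each exclusion, anyone reviewing the log can reproduce exactly which observations fell outside the declared fences.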
Manual justification: When the exclusion is not based on a statistical criterion — for example, when removing rows with a known data collection error or protocol violation — you select the manual criterion type and provide a written justification. This justification is attached to the exclusion record.
What Licklider can and cannot determine automatically
Licklider can flag extreme values under the declared criterion, preview which rows would be affected, and record every exclusion or override that you apply.
However, Licklider cannot determine automatically whether an extreme value is a measurement error, a biologically meaningful extreme, or a design-related artifact. In particular, Licklider cannot reliably determine:
- Whether an extreme value reflects instrument failure, sample contamination, or a genuine responder
- Whether an apparent outlier is actually expected under the protocol, time structure, batch structure, or replicate design
- Whether excluding a row is scientifically preferable to keeping it and reporting the uncertainty
- Whether a manual override reflects a pre-specified rule or a post-hoc attempt to improve the result
These limits matter because an unjustified exclusion can change sample size, effect estimates, and p-values in a way that makes a result look more stable than it really is. That is why Licklider records the criterion, requires written justification for manual overrides, and evaluates sensitivity after the policy is applied.
Where to apply the policy
Outlier and exclusion actions are applied in the Prep panel of the dataset view, before any analysis or figure is generated.
Steps:
- Open the Prep panel for your dataset
- Select the action type (Exclude, Keep, Winsorize, or Manual Override)
- Select the columns to apply the action to
- Declare the criterion (IQR multiplier or manual justification)
- Preview the result before applying
The preview shows which rows would be affected before the action is committed. Once applied, the action is recorded in the Prep Run Report.
What gets recorded
Every outlier action applied to a dataset is recorded in the Outlier Exclusion Log, which includes:
- The action type (exclude, keep, winsorize, manual override)
- The criterion type and multiplier used
- The number of flagged rows per group
- Sample row IDs of flagged observations
- Any manual overrides, with their row IDs, the automatic detection result that was overridden, and the written justification
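One way to picture the shape of a log entry is the sketch below. The field names are purely illustrative and do not reflect Licklider's actual schema:

```python
# Hypothetical sketch of an Outlier Exclusion Log entry.
# Field names are illustrative, not Licklider's actual schema.
log_entry = {
    "action": "exclude",  # exclude | keep | winsorize | manual_override
    "criterion": {"type": "iqr", "multiplier": 1.5},
    "flagged_per_group": {"control": 1, "treated": 2},
    "sample_row_ids": ["r-014", "r-102", "r-177"],
    "overrides": [
        {
            "row_id": "r-102",
            "decision": "keep",
            "auto_result": "flagged",
            "justification": "Known extreme responder per protocol",
        }
    ],
}
print(log_entry["action"], log_entry["criterion"]["multiplier"])
```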
This log is attached to every figure generated from the dataset. It is visible in the Inspector and included in the export disclosure.
What you will see in practice
Outlier handling is not a hidden preprocessing step. In practice, you should expect to see:
- A preview of which rows would be affected before the action is applied
- A recorded action history in the Prep Run Report
- An Outlier Exclusion Log attached to downstream figures
- Any manual overrides, together with their written justifications
- Sensitivity warnings in the Inspector when the result changes materially across analysis variants
These outputs help you judge not only what was removed or changed, but also whether the substantive conclusion still holds after the policy is applied.
Manual override
Manual overrides allow you to make row-level decisions that differ from the automatic criterion. They are appropriate when you have a scientific reason that the criterion alone cannot capture.
Each override requires:
- The specific row or rows to override
- The decision (exclude or keep)
- A written justification
The justification is not for Licklider — it is for your own record, for collaborators, and for reviewers who may ask why a specific observation was treated differently from the others.
Manual overrides are protected: a row marked as keep via manual override will not be removed by a subsequent automatic exclusion action.
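The precedence rule can be sketched as follows. This is hypothetical logic for illustration, not Licklider's code:

```python
# Illustrative sketch of override precedence: a manual "keep" protects a
# row from later automatic exclusion. Hypothetical, not Licklider's code.
def apply_exclusions(rows, auto_flagged, manual_keeps):
    """Drop auto-flagged rows unless a manual keep protects them."""
    protected = set(manual_keeps)
    return [r for r in rows if r not in auto_flagged or r in protected]

rows = ["r-1", "r-2", "r-3", "r-4"]
result = apply_exclusions(rows, auto_flagged={"r-2", "r-4"}, manual_keeps={"r-4"})
print(result)  # "r-4" survives because the manual keep takes precedence
```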
What happens after exclusion
After an exclusion policy is applied and figures are generated, Licklider automatically evaluates whether the results are sensitive to the exclusion. It compares results from three variants:
- Raw: the dataset before any outlier processing
- Processed: the dataset after exclusions
- Robust: an alternative analysis method less sensitive to extreme values
If the conclusions change materially between variants — for example, a significant result becomes non-significant after removing outliers — this is flagged in the Inspector as a sensitivity warning.
In practice, this means the Inspector can show that a conclusion is stable across raw, processed, and robust variants, or that it depends heavily on the exclusion decision. When sensitivity is flagged, the exclusion policy and the affected result remain part of the disclosure record rather than being hidden from the final output.
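The spirit of the raw/processed/robust comparison can be sketched with simple summary statistics. This is illustrative only; the actual variants and analyses Licklider runs may differ:

```python
# Illustrative sketch of a raw / processed / robust comparison.
# Summary means stand in for a full analysis; the actual variants
# Licklider evaluates may differ.
import statistics

def trimmed_mean(values, proportion=0.2):
    """Robust variant: mean after trimming the extreme tails."""
    xs = sorted(values)
    cut = int(len(xs) * proportion)
    return statistics.fmean(xs[cut:len(xs) - cut] if cut else xs)

raw = [5.0, 5.2, 4.9, 5.1, 5.3, 11.0]      # one extreme value
processed = [v for v in raw if v < 10]      # after exclusion
variants = {
    "raw": statistics.fmean(raw),
    "processed": statistics.fmean(processed),
    "robust": trimmed_mean(raw),
}
# A large spread across variants suggests the conclusion is sensitive
# to the exclusion decision and would warrant a sensitivity warning.
spread = max(variants.values()) - min(variants.values())
print(variants, "spread:", round(spread, 3))
```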
This evaluation runs in the background and does not require additional action from you. For more detail on interpreting the sensitivity results, see the Outlier Sensitivity Report.
Design Rationale & References
This page follows a simple rule: outlier handling should be declared before it can influence an inferential result. That is why Licklider asks for an explicit criterion before exclusion or winsorization, records every action in a durable log, and evaluates whether the conclusion changes after the policy is applied.
The IQR-based rule is the default because it is a familiar, distribution-light screening rule that can be stated clearly and traced exactly once the multiplier is recorded [1]. It is not treated as proof that a point is erroneous; it is a transparent starting criterion that can be reviewed and, if needed, overridden with justification.
Winsorization is offered separately from exclusion because some analyses need a way to reduce the influence of extreme values without removing observations entirely. The raw, processed, and robust comparison is intentional for the same reason: it helps reveal when the scientific conclusion depends heavily on a particular handling choice rather than remaining stable across reasonable alternatives [2].
Manual overrides require written justification because the main risk in outlier handling is not only the mathematics of the threshold, but also undisclosed researcher discretion after seeing the result. Recording that discretion makes the decision auditable rather than invisible.
[1] Tukey, J. W. (1977). Exploratory Data Analysis. Addison-Wesley.
[2] Wilcox, R. R. (2017). Introduction to Robust Estimation and Hypothesis Testing. Academic Press.
What this page does not cover
- Reading and interpreting the Outlier Exclusion Log — see Outlier Exclusion Log
- Sensitivity analysis of exclusion decisions — see Outlier Sensitivity Report
- Researcher degrees of freedom and p-hacking risk — see Outliers and Researcher Degrees of Freedom
- N reporting after exclusions — see N Disclosure and Attrition Trail