Box Plot
Use this page to understand the current box-plot support for group comparison, including what it summarizes and when raw points still matter.
A box plot summarizes the distribution of a continuous variable across one or more groups. In a single compact figure, it shows the median, the interquartile range (IQR), the range of non-outlier values, and any observations that fall outside that range.
Box plots are most useful when you have enough observations per group to represent a distribution meaningfully, and when you want to compare spread and central tendency across groups without committing to a parametric summary.
What Licklider draws
Licklider constructs box plots using the following conventions:
| Element | Definition |
|---|---|
| Center line | Median (Q2) |
| Box edges | Q1 (25th percentile) and Q3 (75th percentile) |
| Whiskers | Extend to the most extreme data point within 1.5 × IQR of the box edges (Tukey's method) |
| Points beyond whiskers | Plotted individually as outlier points |
| Quartile method | Linear interpolation (consistent with Plotly quartilemethod: "linear") |
These definitions are fixed and applied consistently across all box plots in Licklider. The whisker method and quartile algorithm are displayed in the Inspector metadata tab for every box plot.
Important: points beyond the whiskers are outliers only in the box-plot sense of Tukey's rule. They are not automatically treated as measurement errors, exclusion candidates, or biologically invalid observations.
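The conventions in the table above can be sketched in a few lines of Python. This is an illustration only, not Licklider's implementation: the function names `quantile_linear` and `tukey_box_stats` are hypothetical, and the interpolation position (n − 1) × p is an assumption. "Linear interpolation" has several variants, and the one used by Licklider or Plotly may place quartiles slightly differently.

```python
from math import floor


def quantile_linear(sorted_vals, p):
    """Quantile by linear interpolation at position (n - 1) * p.

    Assumption: this is one common "linear" variant; other tools may
    use a different position rule and return slightly different values.
    """
    pos = (len(sorted_vals) - 1) * p
    lo = floor(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def tukey_box_stats(values):
    """Return the box-plot elements for one group using Tukey's 1.5 x IQR rule."""
    xs = sorted(values)
    q1, median, q3 = (quantile_linear(xs, p) for p in (0.25, 0.5, 0.75))
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations inside the fences,
    # not at the fences themselves.
    inside = [x for x in xs if low_fence <= x <= high_fence]
    outliers = [x for x in xs if x < low_fence or x > high_fence]
    return {
        "n": len(xs),
        "median": median,
        "q1": q1,
        "q3": q3,
        "w_low": min(inside),
        "w_high": max(inside),
        "outliers": outliers,
    }


stats = tukey_box_stats([2, 3, 4, 5, 6, 7, 8, 9, 10, 30])
print(stats)
```

In this made-up sample, the value 30 lies above the upper fence (8.75 + 1.5 × 4.5 = 15.5), so it is drawn as an individual point and the upper whisker stops at 10.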
Individual data points
By default, only observations that fall outside the whiskers are shown as individual points. You can enable full data overlay (showing every observation) via the Inspector. When the number of observations per group is small, showing all points is strongly recommended.
What the Inspector shows
When a box plot is active, the Inspector displays a summary table with the following statistics for each group:
| Column | Description |
|---|---|
| n | Number of observations |
| Median | Q2 |
| Q1 | 25th percentile |
| Q3 | 75th percentile |
| W. Low | Lower whisker end: smallest observation at or above Q1 − 1.5 × IQR |
| W. High | Upper whisker end: largest observation at or below Q3 + 1.5 × IQR |
| Outliers | Number of observations beyond the whisker ends |
These values match the figure exactly. They can be used directly when reporting descriptive statistics.
Sample size requirements
Box plots require a minimum number of observations per group to represent the distribution meaningfully.
| n per group | Behavior |
|---|---|
| ≥ 10 | Box plot is admissible |
| 5 – 9 | Admissible with confirmation — Licklider will ask you to acknowledge the limitation before displaying |
| < 5 | Not admissible as a distributional summary — Licklider will recommend an alternative display |
When sample size is insufficient to justify a box plot, Licklider suggests showing individual data points directly. The threshold check runs automatically and appears in the Inspector before the figure is finalized.
These thresholds are product heuristics for chart readability and distribution interpretability, not universal scientific cutoffs. They are intended to reduce cases where the box suggests more distributional certainty than the visible data can support.
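The threshold table can be read as a simple per-group classification. The sketch below is a hypothetical illustration of that logic; the function name and the status labels are assumptions and do not describe Licklider's internal API.

```python
def box_plot_admissibility(n):
    """Map a group's sample size to the display behavior in the table above."""
    if n >= 10:
        return "admissible"
    if n >= 5:
        return "admissible_with_confirmation"
    return "not_admissible"


# Hypothetical group sizes, used only to exercise the three branches.
group_sizes = {"control": 12, "low_dose": 7, "high_dose": 3}
statuses = {name: box_plot_admissibility(n) for name, n in group_sizes.items()}
print(statuses)
```

How a figure with mixed group sizes is handled as a whole (for example, whether the smallest group decides) is not specified on this page, so the sketch reports a status for each group separately.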
When to use a box plot
Box plots work well when:
- You have 10 or more observations per group
- You want to compare medians and spread across groups without assuming normality
- Your data may contain outliers that are worth making visible
- You are in an exploratory phase and want a distributional overview
Consider alternatives when:
- n per group is small (< 10) — use a strip plot or dot plot so individual observations are visible
- You want to show mean ± SEM or mean ± SD for a parametric summary — use the Group Comparison Mean and SEM figure
- You want to show the full distribution shape — use a violin plot
- You want to combine distribution and individual points in one figure — use a raincloud plot
Licklider cannot determine from the box alone whether a distribution is multimodal, strongly discrete, or dominated by a small number of points in a way that would mislead a reader. In those situations, raw points or a different figure type may be more honest than a compact summary.
Reporting
When reporting results based on a box plot, describe the summary statistics explicitly rather than referring to the figure alone. The Inspector summary table provides all values needed.
Recommended reporting format:
Data are presented as median [Q1, Q3]; whiskers extend to the most extreme values within 1.5 × IQR (Tukey's method). Observations beyond the whiskers are plotted individually.
Licklider generates a methods text draft that includes this language automatically.
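If you need the same "median [Q1, Q3]" string outside Licklider, a minimal sketch using only the Python standard library might look like the following. The helper name `format_median_iqr` and the sample values are hypothetical. `statistics.quantiles` with `method="inclusive"` is one linear-interpolation variant and is assumed here for illustration, so always report the values from the Inspector summary table when they differ.

```python
from statistics import quantiles


def format_median_iqr(values, digits=2):
    """Format one group's summary as 'median [Q1, Q3]'."""
    # n=4 yields the three quartile cut points: Q1, median, Q3.
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    return f"{median:.{digits}f} [{q1:.{digits}f}, {q3:.{digits}f}]"


# Made-up measurements for a single group.
control = [12, 15, 15, 18, 20, 21, 24, 25, 27, 30]
report = format_median_iqr(control)
print(report)  # 20.50 [15.75, 24.75]
```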
Design rationale and references
Licklider fixes the whisker rule at 1.5 × IQR and the quartile algorithm at linear interpolation so that box plots are computed consistently across figures and exports. That consistency matters because small implementation differences can change which points appear outside the whiskers and how quartiles are reported.
Licklider also recommends showing all points when group sizes are small because compact summaries can hide sparse or irregular data. For small samples, readers often need to see the raw observations to judge whether the box is a faithful summary or whether a strip plot or dot plot would communicate the data more directly.
The n thresholds on this page are therefore not claims about statistical validity on their own. They are display heuristics meant to reduce visually misleading uses of box plots, especially when too few points are present to make quartiles and whiskers feel stable as a visual summary.
References
- Tukey, J. W. (1977). Exploratory Data Analysis. Addison-Wesley.
- Weissgerber, T. L., Milic, N. M., Winham, S. J., & Garovic, V. D. (2015). Beyond bar and line graphs: time for a new data presentation paradigm. PLOS Biology, 13(4), e1002128. https://doi.org/10.1371/journal.pbio.1002128
What this page does not cover
- Statistical comparison between groups — see t-Test, One-Way ANOVA and Post Hoc, Non-parametric Alternatives
- Showing individual data points as a primary figure — see Strip Plot, Dot Plot
- Combining distribution and raw data — see Raincloud Plot
- Assumption checks for group comparison analyses — see Normality and Homoscedasticity