Violin Plot

When to use a violin plot, what it shows, and how Licklider decides whether your data supports one.

A violin plot shows the full distribution of values within each group as a mirrored density curve. Unlike a box plot, which summarizes the distribution with five numbers, a violin plot reveals the shape, including whether the distribution is unimodal, skewed, or multimodal.


When to use a violin plot

Violin plots work best when:

  • Each group has at least 10 observations
  • The distribution shape is meaningful to communicate (for example, when comparing a bimodal distribution to a unimodal one)
  • The data is continuous and not heavily discrete

For smaller samples, a strip plot shows individual points more clearly. For publication-ready figures where distribution shape is less important, a box plot or group comparison mean figure may be more appropriate.


What the figure shows

Each violin displays:

  • The density shape — wider sections indicate more observations at that value
  • An inner box — showing the interquartile range and median (the horizontal line inside the box)
  • A mean line — indicating the group mean
  • Individual points — overlaid with jitter when the option is enabled

Important: the smooth outer shape is a kernel density estimate, not a literal outline of observed frequencies. It can suggest peaks, shoulders, or gaps that are useful as visual clues, but those features still need to be read in light of sample size and the raw data.
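
To see why the outline needs a careful reading, the sketch below builds a small Gaussian kernel density estimate by hand and counts its peaks at two bandwidths. It is an illustration only: this page does not state which kernel or bandwidth rule Licklider uses, so the Gaussian kernel, the bandwidth values, the sample, and the helper names gaussian_kde and count_peaks are all assumptions.

```python
import math

def gaussian_kde(values, bandwidth):
    """Return a function that estimates density at x using a Gaussian kernel."""
    n = len(values)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        )

    return density

def count_peaks(density, lo, hi, steps=400):
    """Count strict local maxima of the density on an evenly spaced grid."""
    ys = [density(lo + (hi - lo) * i / steps) for i in range(steps + 1)]
    return sum(1 for i in range(1, steps) if ys[i - 1] < ys[i] > ys[i + 1])

# Hypothetical sample: seven observations, far too few for a stable shape.
sample = [1.0, 1.2, 1.1, 4.0, 4.1, 3.9, 2.5]

narrow_peaks = count_peaks(gaussian_kde(sample, 0.2), -2.0, 7.0)
wide_peaks = count_peaks(gaussian_kde(sample, 2.0), -2.0, 7.0)
print(narrow_peaks, wide_peaks)
```

The same seven values produce three peaks with the narrow bandwidth and a single peak with the wide one, so the number of bumps in a violin reflects the smoothing choice as well as the data.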


When Licklider suggests a violin plot

Licklider recommends a violin plot when the dataset is large and dense — roughly 30 or more observations per group and 200 or more rows total — and the distribution is unimodal and continuous.

These thresholds are product heuristics for when a smooth density summary is likely to be visually informative rather than misleading. They are not universal scientific cutoffs, and a dataset that passes them can still need a different figure if the raw values are sparse, discrete, or strongly clustered.
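
The size part of this heuristic can be sketched as a simple check. This is not Licklider's implementation: the function name is hypothetical, and applying "per group" to the smallest group is an assumption. The shape conditions (unimodal, continuous) are not covered here.

```python
def meets_violin_size_heuristic(group_sizes, min_per_group=30, min_total=200):
    """Return True when every group and the whole dataset are large enough.

    group_sizes: list of observation counts, one per group.
    Defaults mirror the rough thresholds quoted on this page.
    """
    return (
        len(group_sizes) > 0
        and min(group_sizes) >= min_per_group   # smallest group decides
        and sum(group_sizes) >= min_total       # total rows across groups
    )

print(meets_violin_size_heuristic([120, 90]))       # both conditions hold
print(meets_violin_size_heuristic([40, 40, 40]))    # only 120 rows in total
print(meets_violin_size_heuristic([150, 25, 100]))  # one group below 30
```

Passing this check says only that the dataset is large enough for a density summary to be worth considering, not that a violin is the right figure.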

Violin plots are not recommended when:

  • Multimodality is suspected — when the data shows evidence of multiple peaks, a histogram may communicate the shape better
  • The data is discrete or integer-like — KDE on discrete values produces a misleading smooth curve
  • Outlier fraction is high — extreme values distort the KDE shape
  • There are many groups — six or more groups make violin plots difficult to read
  • The sample is small — fewer than 8 observations per group triggers a quality check
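
Two of these conditions are concrete enough to sketch in code. The example below flags a dataset with six or more groups and any group whose values are all whole numbers. The function name and the "all whole numbers" test for integer-like data are assumptions, not Licklider's actual detection logic. Multimodality and outlier checks are left out because this page does not say how they are measured.

```python
def violin_caveats(groups, max_groups=5):
    """Return a list of reasons a violin plot may be a poor fit.

    groups: dict mapping a group name to a list of numeric values.
    max_groups=5 reflects "six or more groups" from the list above.
    """
    reasons = []
    if len(groups) > max_groups:
        reasons.append("many groups")
    for name, values in groups.items():
        # Assumption: "integer-like" means every value is a whole number.
        if values and all(float(v).is_integer() for v in values):
            reasons.append(f"integer-like values in {name}")
    return reasons

example = {"a": [1, 2, 2, 3], "b": [0.5, 1.25, 2.0]}
print(violin_caveats(example))
```

An empty list means none of the sketched caveats applies; it does not mean the data passes every check described on this page.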

Quality checks

When a violin plot is selected, Licklider evaluates whether the data supports it:

  • Fewer than 8 observations per group: the figure is flagged as exploratory only
  • Between 8 and 19 observations per group: the figure requires confirmation before use in a claim-bearing context
  • 20 or more observations per group: no restriction
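
The three bands above map directly onto a small function. This is a sketch of the documented cutoffs only; the function name and the returned labels are hypothetical, not identifiers from Licklider.

```python
def violin_quality_tier(n_per_group):
    """Map a per-group observation count to the band described above."""
    if n_per_group < 8:
        return "exploratory_only"      # fewer than 8 observations
    if n_per_group < 20:
        return "needs_confirmation"    # 8 to 19 observations
    return "no_restriction"            # 20 or more observations

for n in (7, 8, 19, 20):
    print(n, violin_quality_tier(n))
```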

Similar thresholds apply to the proportion of zero values and the fraction of potential outliers within each group.

When the data does not clearly support a violin plot, Licklider presents an option to switch to a box plot with individual points overlaid.

These checks reduce common visual failure modes, but they do not guarantee that the KDE shape is scientifically appropriate. Licklider cannot fully determine whether an apparent peak is a stable feature of the distribution or an artifact of small n, discreteness, heavy tails, or a few influential observations. In those situations, showing raw points or switching to a box plot may be more honest than relying on a smooth density silhouette.

Design rationale and references

Licklider treats violin plots as shape-first figures. That is why they are recommended only when there are enough data for the density estimate to add information beyond a box plot or strip plot. When the data are small, sparse, or strongly discrete, a smooth violin can look more certain than the observed values justify.

The threshold bands on this page are therefore display heuristics, not claims that a sample below a cutoff is never analyzable. Their purpose is to reduce cases where a kernel density estimate makes the distribution look richer, smoother, or more stable than it really is.

When the data does not clearly support a violin, Licklider also suggests switching to a box plot with points. That alternative keeps a compact summary while exposing more of the raw observations that the KDE might otherwise blur.

References

  1. Hintze, J. L., & Nelson, R. D. (1998). Violin plots: A box plot-density trace synergism. The American Statistician, 52(2), 181-184. https://doi.org/10.1080/00031305.1998.10480559
  2. Weissgerber, T. L., Milic, N. M., Winham, S. J., & Garovic, V. D. (2015). Beyond bar and line graphs: Time for a new data presentation paradigm. PLOS Biology, 13(4), e1002128. https://doi.org/10.1371/journal.pbio.1002128

What this page does not cover