Table Shape: Wide vs Long

How to recognize whether your data is in wide or long format, which format Licklider expects, and how to convert if needed.

The most common structural problem when bringing experimental data into Licklider is table shape. Many researchers record data in wide format because it reads naturally in a spreadsheet. Some analyses, particularly those involving repeated measurements, timepoints, or conditions encoded as column names, require long format. This page explains the difference and how to convert, so that Licklider interprets the structure you intend before it proceeds to analysis setup and generates figures and results.


The difference

Wide format

Each observation unit occupies one row. Repeated measurements, timepoints, conditions, or replicates are spread across multiple columns.

Sample | Group | Day1 | Day2 | Day3
A      | CTL   | 1.2  | 1.4  | 1.5
B      | TRT   | 0.9  | 1.1  | 1.3
C      | CTL   | 1.1  | 1.3  | 1.6

This layout is natural to fill in during an experiment and is the default output of many lab instruments and Excel templates. The column names themselves carry meaning — in this case, the timepoint.

Other common wide patterns in life sciences:

Sample | CTL_rep1 | CTL_rep2 | TRT_rep1 | TRT_rep2
Sample | Dose_0   | Dose_10  | Dose_50
Sample | baseline | week4    | week8

In all of these, the group, condition, dose, or timepoint is embedded in the column name rather than stored as a value in its own column.
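
To make "embedded in the column name" concrete, here is a minimal sketch of pulling the encoded parts back out of names like the ones above. The helper name split_column_name and the assumption that an underscore separates the parts are illustrative only; they do not describe how Licklider parses column names.

```python
def split_column_name(name, sep="_"):
    """Split an encoded column name like 'CTL_rep1' into (prefix, suffix).

    Assumes the LAST separator divides the two encoded parts. This
    convention is a hypothetical illustration, not Licklider behavior.
    """
    prefix, found, suffix = name.rpartition(sep)
    if not found:
        # No separator present: the whole name is one encoded value (e.g. 'week4').
        return (name, None)
    return (prefix, suffix)


columns = ["CTL_rep1", "CTL_rep2", "TRT_rep1", "TRT_rep2", "Dose_0", "Dose_10", "week4"]
for col in columns:
    print(col, "->", split_column_name(col))
```

Names such as CTL_rep1 carry two variables at once (condition and replicate), which is why a reshape of such columns usually needs a decision about which part becomes which column.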

Long format

Each individual measurement occupies one row. A single value column holds the measurements. Additional columns identify what that measurement belongs to: which sample, which group, which timepoint or condition.

Sample | Group | Timepoint | Value
A      | CTL   | Day1      | 1.2
A      | CTL   | Day2      | 1.4
A      | CTL   | Day3      | 1.5
B      | TRT   | Day1      | 0.9
B      | TRT   | Day2      | 1.1
B      | TRT   | Day3      | 1.3
C      | CTL   | Day1      | 1.1
C      | CTL   | Day2      | 1.3
C      | CTL   | Day3      | 1.6

The same data, restructured. The timepoint is now an explicit column rather than embedded in the column name.
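
The restructuring itself is mechanical. The sketch below reshapes the wide example table using only plain Python, to show what happens to each row. The function name wide_to_long and its parameters are chosen for illustration and are not part of Licklider.

```python
# Wide example table from this page: one row per sample.
wide_rows = [
    {"Sample": "A", "Group": "CTL", "Day1": 1.2, "Day2": 1.4, "Day3": 1.5},
    {"Sample": "B", "Group": "TRT", "Day1": 0.9, "Day2": 1.1, "Day3": 1.3},
    {"Sample": "C", "Group": "CTL", "Day1": 1.1, "Day2": 1.3, "Day3": 1.6},
]


def wide_to_long(rows, id_columns, value_columns,
                 var_name="Timepoint", value_name="Value"):
    """Emit one output row per (input row, value column) pair."""
    long_rows = []
    for row in rows:
        for col in value_columns:
            # Copy the identifying columns unchanged.
            record = {key: row[key] for key in id_columns}
            # The former column name becomes a value in its own column.
            record[var_name] = col
            record[value_name] = row[col]
            long_rows.append(record)
    return long_rows


long_rows = wide_to_long(wide_rows, ["Sample", "Group"], ["Day1", "Day2", "Day3"])
for record in long_rows:
    print(record)
```

Three samples with three timepoint columns each yield nine long rows, one per measurement.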


When long format is required

Long format is required when your analysis depends on a variable that is currently encoded as a column name rather than a column value. This applies to:

  • Repeated measures analyses, where timepoint or condition must be a variable in the model
  • Group comparisons where group membership is spread across columns rather than held in a single group column
  • Any analysis where Licklider needs to identify a timepoint column, a condition column, or a replicate structure — and that information is currently embedded in column names

Licklider cannot always infer from column names alone whether a reshaping step matches your intended analysis design. If timepoint, condition, dose, or replicate structure is encoded ambiguously, the wrong reshape can carry the wrong variable definitions downstream into analysis setup, quality checks, figures, and interpretation.

Not all wide data needs converting. If your file has one row per observation and each column represents a distinct variable — age, weight, blood pressure, biomarker concentration — that structure is already correct for most regression and association analyses. The distinction is whether the column names themselves encode a variable you intend to analyze.


How to tell which format you have

Ask these two questions:

1. Are repeated measurements, timepoints, or conditions stored as separate columns? If yes — Day1, Day2, Day3; CTL_rep1, CTL_rep2; Dose_0, Dose_10 — the column names encode a variable. Your data is wide in the sense that matters here.

2. Is there a single column that holds all your measurement values, with separate columns identifying what each value belongs to? If yes, your data is long.

If you are unsure, count your rows relative to your sample size. A dataset with 10 samples measured at 3 timepoints will have 30 rows in long format and 10 rows in wide format.
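
The row-count check can be sketched as a small heuristic: compare the number of rows with the number of distinct sample IDs. The function name guess_shape is hypothetical, and the measurement values below are placeholders rather than real data. Treat the result as a hint only, since a long table with a single measurement per sample would also have one row per ID.

```python
def guess_shape(rows, id_column):
    """Guess 'wide' or 'long' from how often each sample ID appears.

    Heuristic only: one row per ID suggests wide, repeated IDs suggest long.
    """
    n_rows = len(rows)
    n_ids = len({row[id_column] for row in rows})
    if n_ids == 0:
        return "empty"
    return "wide" if n_rows == n_ids else "long"


# 10 samples measured at 3 timepoints; 0.0 is a placeholder measurement.
samples = [f"S{i}" for i in range(1, 11)]
timepoints = ["Day1", "Day2", "Day3"]

wide = [{"Sample": s, "Day1": 0.0, "Day2": 0.0, "Day3": 0.0} for s in samples]
long = [{"Sample": s, "Timepoint": t, "Value": 0.0}
        for s in samples for t in timepoints]

print(len(wide), guess_shape(wide, "Sample"))
print(len(long), guess_shape(long, "Sample"))
```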


How to convert

Option A: Convert inside Licklider

Upload your wide-format file. Once the file is in the workspace, open the Data tab and use the Prep panel to add a Wide to Long transformation.

You will specify:

  • ID columns — the columns that identify each observation unit and should be kept as-is (e.g., Sample, Group)
  • Value columns — the columns whose names encode a variable and should be collapsed into a single value column (e.g., Day1, Day2, Day3)

Preview the result before applying. Once applied, the transformation is recorded in the Preprocessing Audit Log with the full definition of what was reshaped and how. Converting to the right long format makes it easier for Licklider to identify analysis variables correctly, run the appropriate setup checks, and produce figures and results that match the design you intend to analyze.

Option B: Convert before uploading

If you prefer to reshape outside Licklider:

  • Excel: Use Power Query (Data → Get & Transform) to unpivot columns
  • R: tidyr::pivot_longer()
  • Python: pandas.DataFrame.melt()

Save the result as a UTF-8 CSV and upload.
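
For the Python route, a minimal sketch using pandas.DataFrame.melt() on the example table from this page might look like the following. The output file name long_format.csv and the column names Timepoint and Value are assumptions for illustration; use whatever names fit your design.

```python
import io

import pandas as pd

# Inline copy of the wide example table; in practice, read your own file.
wide_csv = io.StringIO(
    "Sample,Group,Day1,Day2,Day3\n"
    "A,CTL,1.2,1.4,1.5\n"
    "B,TRT,0.9,1.1,1.3\n"
    "C,CTL,1.1,1.3,1.6\n"
)
wide = pd.read_csv(wide_csv)

long = wide.melt(
    id_vars=["Sample", "Group"],            # kept as-is on every output row
    value_vars=["Day1", "Day2", "Day3"],    # collapsed into one value column
    var_name="Timepoint",                   # holds the former column names
    value_name="Value",                     # holds the measurements
).sort_values(["Sample", "Timepoint"], ignore_index=True)

# Hypothetical output path; UTF-8 without the index column.
long.to_csv("long_format.csv", index=False, encoding="utf-8")
print(long)
```

The sort is only for readability. It orders timepoints as text, so labels such as Day10 would sort before Day2; convert the label to a number first if order matters.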


What this page does not cover


Design Rationale & References

This page follows a simple design principle: variables that matter to the analysis should be represented as explicit columns, not hidden inside column names. That is why long format is required when timepoint, condition, dose, or replicate structure must be interpreted as analysis variables.

  1. Wickham, H. (2014). Tidy Data. Journal of Statistical Software, 59(10), 1-23. https://doi.org/10.18637/jss.v059.i10
  2. Peng, R. D. (2011). Reproducible Research in Computational Science. Science, 334(6060), 1226-1227. https://doi.org/10.1126/science.1213847