Linear Regression (OLS)
How to run ordinary least squares regression in Licklider, what the results include, and how the model is visualized.
Linear regression models the relationship between a continuous outcome variable and one or more predictor variables. Licklider fits ordinary least squares (OLS) regression using a standard implementation and returns the model coefficients with their standard errors, t-statistics, p-values, and 95% confidence intervals, along with model fit statistics.
When to use linear regression
Linear regression is appropriate when:
- The outcome variable is continuous and unbounded
- You want to estimate how the outcome changes with one or more predictors
- The relationship between predictors and outcome is approximately linear
If the outcome is binary or a proportion, logistic regression is more appropriate → see Logistic Regression and AUC/ROC.
If the relationship is non-linear — for example, a sigmoidal dose-response curve — non-linear regression is more appropriate → see Non-linear Regression and IC50/4PL.
How to request it
Describe the analysis in the Chat. For example:
- "Run a linear regression of body weight on dose"
- "Regress gene expression on treatment intensity and age"
- "Show the relationship between X and Y with a regression line"
Licklider will fit the model and display the results.
What the results include
Coefficient table
One row per predictor, including the intercept. Each row shows:
- Estimate — the regression coefficient
- Standard error
- t-statistic
- p-value
- 95% confidence interval (lower and upper bounds)
Model fit statistics
- R² — the proportion of variance in the outcome explained by the model
- Adjusted R² — R² penalized for the number of predictors
- F-statistic and its p-value — the overall test of whether any predictor explains the outcome
- Residual standard error
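As a rough illustration of how the quantities in the two tables above relate to each other, here is a minimal sketch for a single-predictor model using only the Python standard library. It is not Licklider's implementation. The function name `simple_ols`, the toy data, and the `t_crit` argument are assumptions made for this example. The critical value is passed in by hand because the standard library has no t distribution. For the same reason p-values are omitted: they need the t and F distribution functions from a statistics library.

```python
import math

def simple_ols(x, y, t_crit):
    """Fit y = intercept + slope * x by least squares and summarize the fit.

    t_crit is the two-sided 95% critical value of the t distribution with
    n - 2 degrees of freedom, supplied by the caller (an assumption of this
    sketch; look it up in a table or a statistics library).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_res = sum(e ** 2 for e in residuals)      # unexplained variation
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)  # total variation

    k = 1                                # predictors, excluding the intercept
    df_resid = n - k - 1
    rse = math.sqrt(ss_res / df_resid)   # residual standard error

    estimates = {"intercept": intercept, "slope": slope}
    std_errors = {
        "intercept": rse * math.sqrt(1 / n + x_bar ** 2 / sxx),
        "slope": rse / math.sqrt(sxx),
    }

    # Coefficient table: one row per term, including the intercept.
    table = {}
    for name, est in estimates.items():
        se = std_errors[name]
        table[name] = {
            "estimate": est,
            "std_error": se,
            "t": est / se,
            "ci_95": (est - t_crit * se, est + t_crit * se),
        }

    r2 = 1 - ss_res / ss_tot
    fit = {
        "r2": r2,
        "adj_r2": 1 - (1 - r2) * (n - 1) / df_resid,
        "f": ((ss_tot - ss_res) / k) / (ss_res / df_resid),
        "rse": rse,
    }
    return table, fit

# Toy data, made up for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
T_CRIT_DF3 = 3.182446  # 0.975 quantile of the t distribution, 3 degrees of freedom
table, fit = simple_ols(x, y, T_CRIT_DF3)
```

With a single predictor, the F-statistic equals the square of the slope's t-statistic, which is why the overall test and the slope test agree in that case.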
Correlation
When a linear regression is run on a scatter or regression chart, Licklider automatically calculates both Pearson and Spearman correlation coefficients. The results appear in the Correlation panel of the Inspector, alongside the regression output.
The primary correlation is selected based on the normality of the regression residuals:
- If the Shapiro-Wilk test does not reject normality of the residuals (p > 0.05): Pearson is primary
- If the test rejects normality (p ≤ 0.05): Spearman is primary
Both coefficients are always reported. The primary designation indicates which is statistically appropriate given the data. For a full discussion, see Correlation Analysis.
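The selection rule above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not Licklider's code. The function names are hypothetical, and the Shapiro-Wilk p-value is taken as an input because the standard library does not provide that test.

```python
import math

def pearson(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def primary_correlation(shapiro_p, alpha=0.05):
    """Apply the rule described above to a residual-normality p-value."""
    return "pearson" if shapiro_p > alpha else "spearman"

# Toy data: a monotonic but curved relationship.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
```

On this toy data the Spearman coefficient is 1 because the relationship is perfectly monotonic, while the Pearson coefficient is slightly lower because the relationship is not a straight line.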
Visualization
When a linear regression is run on a two-variable scatter plot, Licklider automatically overlays:
- The fitted regression line
- A 95% confidence band around the mean response
The band is intentionally a confidence band for the estimated mean response, not a prediction interval for individual future observations. It shows how uncertain the estimated mean trend is and does not represent the spread of individual points around the line, so individual observations can and often do fall outside it. That choice keeps the default figure aligned with the fitted regression line itself.
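To make the distinction between a confidence band and a prediction interval concrete, the sketch below computes the half-width of each at a chosen predictor value using the textbook formulas for simple linear regression. It is an illustration only. The function name, the toy data, and the hand-supplied critical value `t_crit` are assumptions of this example.

```python
import math

def interval_half_widths(x, y, x0, t_crit):
    """Return (confidence, prediction) half-widths at predictor value x0.

    t_crit is the two-sided critical value of the t distribution with
    n - 2 degrees of freedom, supplied by the caller.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    rse = math.sqrt(ss_res / (n - 2))  # residual standard error

    # Uncertainty of the fitted mean grows with distance from the mean of x.
    mean_term = 1 / n + (x0 - x_bar) ** 2 / sxx

    confidence = t_crit * rse * math.sqrt(mean_term)
    # A new observation adds its own residual variance (the extra "1 +").
    prediction = t_crit * rse * math.sqrt(1 + mean_term)
    return confidence, prediction

# Toy data, made up for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
T_CRIT_DF3 = 3.182446  # 0.975 quantile of the t distribution, 3 degrees of freedom

conf_center, pred_center = interval_half_widths(x, y, 3, T_CRIT_DF3)
conf_edge, pred_edge = interval_half_widths(x, y, 5, T_CRIT_DF3)
```

The prediction half-width is always larger than the confidence half-width, and both widen toward the edges of the observed predictor range.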
Multiple predictors
Linear regression with more than one predictor is supported. Each predictor's coefficient is the estimated change in the outcome per one-unit increase in that predictor, holding all other predictors constant.
When multiple predictors are included, Licklider evaluates the predictor structure for potential issues — including collinearity and sample size adequacy — before allowing claim-bearing output. This guard is meant to catch common structural problems that make regression coefficients unstable or hard to interpret, especially when the model is too complex for the available sample or when predictors overlap heavily in what they measure. For more detail → see Regression Diagnostics Guard.
The guard does not certify that the model is fully valid. It focuses on predictor structure, not on whether the relationship is truly linear, whether residual variance is constant, whether influential points dominate the fit, or whether clustered observations violate independence.
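As an illustration of what a predictor-structure check can look like, the sketch below computes two common screening quantities for a two-predictor model: the variance inflation factor (VIF) and the number of observations per predictor. This page does not specify the criteria the Regression Diagnostics Guard actually applies. The function names and both thresholds (a VIF above 10, fewer than 10 observations per predictor) are widely used rules of thumb adopted here as assumptions.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model: 1 / (1 - r^2)."""
    r2 = pearson(x1, x2) ** 2
    return math.inf if r2 >= 1 else 1 / (1 - r2)

def structure_flags(vif, n_obs, n_predictors,
                    vif_limit=10.0, min_obs_per_predictor=10):
    """Return the names of the screening rules that fail.

    Both thresholds are illustrative assumptions, not Licklider settings.
    """
    flags = []
    if vif > vif_limit:
        flags.append("collinearity")
    if n_obs / n_predictors < min_obs_per_predictor:
        flags.append("sample_size")
    return flags

# Toy predictors, made up for illustration only (correlation 0.8).
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
vif = vif_two_predictors(x1, x2)
flags = structure_flags(vif, n_obs=len(x1), n_predictors=2)
```

Here the predictors overlap only moderately, so the collinearity rule passes, but five observations for two predictors fails the sample-size rule.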
Assumptions
Linear regression assumes:
- The outcome variable is continuous
- The relationship between predictors and outcome is linear
- Residuals are approximately normally distributed
- Residuals have roughly constant variance (homoscedasticity)
- Observations are independent
Licklider checks normality of the residuals automatically for single-predictor models. For multi-predictor models, the Regression Diagnostics Guard evaluates predictor structure only. In both cases, inspect residual diagnostics separately when the assumptions are in question.
These checks reduce common mistakes, but they do not validate your study design for you. In particular, Licklider does not automatically determine whether rows that look separate are actually repeated measurements from the same subject, animal, plate, well, batch, or cluster. If that structure is hidden in the table, ordinary OLS can report coefficients, standard errors, and p-values that look more certain than they should.
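If the table carries an identifier for the experimental unit, a quick count of rows per unit can reveal hidden repeated structure before fitting. The sketch below is one simple way to do that outside Licklider. The function name and the `animal_id` column are hypothetical.

```python
from collections import Counter

def repeated_units(rows, id_column):
    """Return {unit: row_count} for every unit that appears more than once."""
    counts = Counter(row[id_column] for row in rows)
    return {unit: count for unit, count in counts.items() if count > 1}

# Toy table, made up for illustration only.
rows = [
    {"animal_id": "A1", "dose": 1, "weight": 20.1},
    {"animal_id": "A1", "dose": 2, "weight": 20.9},
    {"animal_id": "A2", "dose": 1, "weight": 19.8},
    {"animal_id": "A3", "dose": 2, "weight": 21.4},
    {"animal_id": "A3", "dose": 3, "weight": 22.0},
]
repeats = repeated_units(rows, "animal_id")
```

A non-empty result means some rows are not independent observations, and an ordinary OLS fit on the raw rows is likely to understate uncertainty.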
Licklider also does not automatically prove that the predictor-outcome relationship is linear, that residuals are well-behaved across the full range of fitted values, or that the result is not being driven by a small number of influential points. Those are model-checking questions, not guarantees of the basic OLS fit.
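One common way to ask whether a few points dominate a fit is Cook's distance, which combines each point's residual with its leverage. The sketch below computes it for a single-predictor model with the standard library. It is offered as an illustration of the model-checking question, not as a description of Licklider's diagnostic plots. The function name and the toy data are assumptions.

```python
def cooks_distances(x, y):
    """Cook's distance for each observation in a simple linear regression."""
    n = len(x)
    p = 2  # estimated parameters: intercept and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s2 = sum(e ** 2 for e in residuals) / (n - p)  # residual variance

    distances = []
    for xi, e in zip(x, residuals):
        # Leverage: how far this x value sits from the mean of x.
        h = 1 / n + (xi - x_bar) ** 2 / sxx
        distances.append((e ** 2 / (p * s2)) * h / (1 - h) ** 2)
    return distances

# Toy data: four points on a clear trend plus one distant point that breaks it.
x = [1, 2, 3, 4, 10]
y = [1.1, 1.9, 3.2, 3.9, 2.0]
distances = cooks_distances(x, y)
most_influential = distances.index(max(distances))
```

In this toy data the last point has by far the largest Cook's distance. A frequently quoted rule of thumb treats values above 1 as worth a closer look, but any cutoff is a convention rather than a test.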
If your data are clustered, repeatedly measured, strongly non-linear, or visibly heteroscedastic, pause before interpreting the OLS result and review Repeated Measures and Mixed Models, the Regression Diagnostics Guard, and the relevant diagnostic plots.
Design rationale and references
Licklider shows coefficients, confidence intervals, and fit statistics because regression is usually used to support directional scientific claims, not just to summarize association. Reporting uncertainty around each coefficient helps readers judge both magnitude and precision rather than focusing on p-values alone.
Licklider also computes Pearson and Spearman correlations alongside the regression output as descriptive companions, not as replacements for the fitted model. This gives new users a quick read on simple association while keeping the regression coefficient table as the main inferential result.
For multi-predictor models, Licklider separates predictor-structure checks from residual diagnostics. Overfitting and collinearity are common failure modes that need to be ruled out before interpretation begins, especially when the model includes many overlapping predictors relative to sample size. That is why the guard can block claim-bearing output for structural problems while still leaving residual diagnostics as a separate interpretive step.
Exact p-values are reported as numeric values rather than threshold labels so readers can interpret evidence in context rather than treating a cutoff as a binary pass-fail rule.
References
- Babyak, M. A. (2004). What you see may not be what you get: A brief, nontechnical introduction to overfitting in regression-type models. Psychosomatic Medicine, 66(3), 411-421. https://doi.org/10.1097/01.psy.0000127692.23278.a9
- Dormann, C. F., Elith, J., Bacher, S., et al. (2013). Collinearity: a review of methods to deal with it and a simulation study evaluating their performance. Ecography, 36(1), 27-46. https://doi.org/10.1111/j.1600-0587.2012.07348.x
- Wasserstein, R. L., & Lazar, N. A. (2016). The ASA statement on p-values: Context, process, and purpose. The American Statistician, 70(2), 129-133. https://doi.org/10.1080/00031305.2016.1154108
What this page does not cover
- Binary or bounded outcome regression → see Logistic Regression and AUC/ROC
- Non-linear curve fitting → see Non-linear Regression and IC50/4PL
- Repeated measures or clustered data → see Repeated Measures and Mixed Models
- Predictor structure quality checks → see Regression Diagnostics Guard
- Full residual diagnostics and influence analysis → see the diagnostic plot pages linked from Regression Diagnostics Guard