Regression Workflow

How to choose and run a regression analysis in Licklider, from identifying the outcome type to reviewing results and checking assumptions.

Regression models the relationship between an outcome variable and one or more predictors. The right regression method depends on the nature of the outcome: continuous, binary, or non-linear. This page describes how to choose and run a regression analysis in Licklider, what outputs you can expect, and where Licklider adds guardrails before results are used for claim-bearing export.


Choosing a regression method

Licklider starts with the outcome type because the model family determines what quantity is being estimated and how the results should be interpreted. A continuous outcome supports coefficient-based estimation; a binary outcome supports probability and odds-based interpretation; a sigmoidal assay response calls for a non-linear curve model rather than a straight-line fit.

The outcome is continuous and the relationship is approximately linear

Use linear regression (OLS). This estimates how the outcome changes with each predictor and reports coefficients, R<sup>2</sup>, p-values, and confidence intervals in a coefficient table.

Request it with: "Run a linear regression of Y on X" or "Regress outcome on predictor and age."

→ see Linear Regression (OLS)
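For orientation, the core quantities an OLS fit reports can be sketched in a few lines. This is an illustrative sketch, not Licklider's implementation; the function name and data are made up.

```python
def ols_fit(x, y):
    """Fit y = b0 + b1*x by least squares; return (b0, b1, r_squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b1 = sxy / sxx                       # slope: outcome change per unit of predictor
    b0 = my - b1 * mx                    # intercept
    ss_res = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot             # R^2: proportion of variance explained
    return b0, b1, r2

b0, b1, r2 = ols_fit([0, 1, 2, 3], [1.0, 3.0, 5.0, 7.0])  # exactly linear data
# b1 = 2.0, b0 = 1.0, r2 = 1.0
```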

The outcome is binary

Use logistic regression. This estimates the probability of the outcome given the predictors and reports odds ratios, confidence intervals, and AUC.

Request it with: "Run a logistic regression" or "Model response as a function of dose."

→ see Logistic Regression and AUC/ROC
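The way a fitted logistic model is read can be sketched as follows: the linear predictor gives log-odds, the sigmoid converts log-odds to a probability, and exponentiating a coefficient gives an odds ratio. The coefficient values here are hypothetical, not output from Licklider.

```python
import math

def predicted_probability(intercept, coef, x):
    """P(outcome = 1 | x) under a fitted logistic model."""
    log_odds = intercept + coef * x
    return 1 / (1 + math.exp(-log_odds))

beta = 0.7                   # hypothetical fitted coefficient for one predictor
odds_ratio = math.exp(beta)  # multiplicative change in the odds per unit of x
p = predicted_probability(-1.0, beta, 2.0)  # probability for a specific case
```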

The relationship is sigmoidal (dose-response)

Use non-linear regression with a 4PL, 3PL, or Hill model. This estimates the IC50 and Hill slope, together with the fitted dose-response curve.

Request it with: "Fit a dose-response curve" or "Calculate the IC50."

→ see Dose-response Curves
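One common parameterization of the 4PL curve is sketched below. Conventions vary (for example, log-dose forms), so treat this as an illustration of what the fitted parameters mean rather than Licklider's exact model.

```python
def four_pl(x, bottom, top, ic50, hill):
    """Four-parameter logistic dose-response curve: at x == ic50 the response
    is halfway between bottom and top, and `hill` sets the steepness."""
    return bottom + (top - bottom) / (1 + (x / ic50) ** hill)

midpoint = four_pl(5.0, bottom=0.0, top=100.0, ic50=5.0, hill=1.2)  # 50.0
```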


Running the analysis

Describe the regression in the Chat with the outcome and the predictors:

  • "Regress gene expression on treatment dose and time point"
  • "Is age a predictor of response rate?"
  • "Show the relationship between X and Y with a regression line"

Licklider will identify the relevant columns and fit the appropriate model. This reduces mechanical setup errors at the point of analysis, but it does not replace study-design judgment about whether the chosen predictors should be in the same model.


Reviewing the results

Results appear in the Stats panel of the Inspector. For linear and logistic regression, the panel shows the full coefficient table with standard errors, p-values, and confidence intervals.

For multi-predictor models, the Statistical Results Table in the Table tab provides a downloadable version of the coefficient table.

Depending on the model, the main outputs you can expect on this workflow are:

  • Linear regression: coefficients, R<sup>2</sup>, p-values, confidence intervals, and a coefficient table
  • Logistic regression: odds ratios, AUC, p-values, confidence intervals, and a coefficient table
  • Non-linear dose-response regression: IC50, Hill slope, and the fitted curve
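Among these outputs, AUC has a simple rank-based reading: it is the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case (ties counted as half). A minimal sketch, with made-up scores:

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in scores_pos for n in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))

perfect = auc([0.9, 0.8], [0.2, 0.1])  # 1.0: every positive outranks every negative
```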

Checking assumptions

For multi-predictor linear regression, Licklider evaluates the predictor structure before claim-bearing export:

  • Whether the sample size is adequate relative to the number of predictors
  • Whether predictors are highly correlated with each other
  • Whether any predictors appear to be duplicates or aliases

If a structural problem is found, the Inspector will indicate what was detected and present options for proceeding.

This check appears before claim-bearing export because unstable predictor structure can make a regression look more convincing than it really is. The goal is to catch common structural problems before coefficients are over-interpreted.

For more detail → see Regression Diagnostics Guard.
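As a sketch of what a pairwise-overlap check can look like, the snippet below flags predictor pairs whose correlation is very high. The 0.9 threshold and the function names are illustrative assumptions, not Licklider's actual cutoff or implementation.

```python
def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric columns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def flag_correlated_pairs(predictors, threshold=0.9):
    """Return (name, name) pairs of predictors with |r| above the threshold."""
    names = list(predictors)
    return [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
            if abs(pearson_r(predictors[a], predictors[b])) > threshold]

flags = flag_correlated_pairs({"dose": [1, 2, 3, 4],
                               "dose_ug": [10, 20, 30, 40],  # same variable, other units
                               "age": [30, 25, 41, 33]})
```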

Residual plot

After fitting a linear or non-linear regression, you can request a residual plot to check whether the model assumptions are met:

  • "Show the residual plot"
  • "Show residuals vs fitted values"

→ see Residual Plot
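The quantity behind this plot is simple: a residual is the observed value minus the fitted value, and a trend or funnel shape in residuals versus fitted values suggests a violated assumption. A minimal sketch, not Licklider's plotting code:

```python
def residuals(y_observed, y_fitted):
    """Residuals: observed minus fitted, in the original row order."""
    return [obs - fit for obs, fit in zip(y_observed, y_fitted)]

res = residuals([3.0, 5.0, 4.0], [2.5, 5.5, 4.0])  # [0.5, -0.5, 0.0]
```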

What Licklider does not detect automatically

Passing the predictor-structure guard does not mean that every important regression risk has been ruled out. In particular, Licklider does not automatically detect:

  • Unmeasured confounding or omitted variables that change the interpretation of an estimated effect
  • Non-independence such as repeated measures, clustering, or pseudoreplication treated as if all rows were independent
  • Incorrect causal interpretation of an associational model
  • Every incorrect functional form, interaction, or influential-point problem after model fitting

Those limits matter because a model can clear the workflow guard and still support a misleading claim. Use this workflow as a structured starting point, not as a substitute for subject-matter judgment about study design and interpretation.


Covariate adjustment disclosure

When a regression includes more than one predictor, Licklider requires that the covariates be acknowledged before confirmatory results can be exported. This confirms that the covariate list will be reported in the methods text.

Licklider asks for this disclosure because the meaning of an adjusted coefficient depends on which other variables were included in the model. Requiring the covariate list to be acknowledged helps prevent confirmatory results from being exported without the model specification being reported.

→ see Confounding Adjustment Disclosure


Design rationale and references

Licklider uses outcome type as the first branching decision because the scientific question and the interpretation of the model output differ across continuous, binary, and sigmoidal outcomes. The workflow also separates predictor-structure checks from later residual inspection so that obvious structural risks can be surfaced before a user moves to claim-bearing interpretation.

The guardrails on predictor count, overlap, and confirmatory disclosure are meant as practical protections for non-specialist users, not as a claim that every regression assumption can be checked automatically from one page.

  1. Babyak, M. A. (2004). What You See May Not Be What You Get: A Brief, Nontechnical Introduction to Overfitting in Regression-Type Models. Psychosomatic Medicine, 66(3), 411-421.

    → Supports warning against overfitting and over-interpretation when model complexity starts to outgrow the available data.

  2. Vittinghoff, E., & McCulloch, C. E. (2007). Relaxing the Rule of Ten Events per Variable in Logistic and Cox Regression. American Journal of Epidemiology, 165(6), 710-718.

    → Supports treating sample-size adequacy as a modeling concern that depends on context rather than on a single universal cutoff.


What this page does not cover