Kaplan-Meier Analysis

Kaplan-Meier analysis estimates the survival function (the probability of surviving past a given time point) from time-to-event data. It accounts for censoring: observations for which the event had not occurred by the end of follow-up or by the time the subject left the study.
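
The product-limit calculation behind the estimate can be sketched as follows. This is a minimal illustration on plain Python lists, not Licklider's implementation; the function name kaplan_meier and the toy data are assumptions made for the example. At each distinct event time, the running survival estimate is multiplied by (1 - events / number at risk), and censored subjects leave the risk set without causing a step.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(time, n_at_risk, n_events, survival)] for each distinct event time.

    times  -- follow-up durations
    events -- 1 if the event occurred, 0 if the observation was censored
    """
    event_counts = Counter(t for t, e in zip(times, events) if e)
    exit_counts = Counter(times)      # everyone leaving the risk set at each time
    n_at_risk = len(times)
    survival = 1.0
    rows = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d:
            # Subjects censored at t are still counted as at risk at t.
            survival *= 1 - d / n_at_risk
            rows.append((t, n_at_risk, d, survival))
        n_at_risk -= exit_counts[t]
    return rows

# Hypothetical toy data: six subjects, two of them censored (event = 0).
toy_times = [2, 3, 3, 5, 8, 9]
toy_events = [1, 1, 0, 1, 0, 1]
for row in kaplan_meier(toy_times, toy_events):
    print(row)
```

The subject censored at time 8 produces no row of its own, but it lowers the number at risk for the event at time 9.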

When to use Kaplan-Meier analysis

Kaplan-Meier analysis is appropriate when:

The outcome is time to an event (death, relapse, treatment failure, or any other defined endpoint)
Some observations are censored
You want to estimate and compare survival curves across groups

If you want to model the effect of covariates on survival time, Cox proportional hazards regression is more appropriate — see Cox Proportional Hazards Regression.

Data requirements

Your dataset needs at minimum:

A time column — the duration from the start of follow-up to the event or censoring (for example: survival_time, days_to_event, os, pfs)
An event indicator column — a binary variable indicating whether the event occurred (1 or True = event occurred; 0 or False = censored)
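
As a rough sketch of what these two columns need to look like, the check below validates a time column and an event column held as plain Python lists. The function name check_survival_columns and its messages are hypothetical; this is not how Licklider performs detection, only an illustration of the minimum structure described above.

```python
def check_survival_columns(times, events):
    """Return a list of problems found in a time column and an event column."""
    problems = []
    if len(times) != len(events):
        problems.append("time and event columns differ in length")
    if any(t is None or t < 0 for t in times):
        problems.append("time column has missing or negative durations")
    allowed = {0, 1}  # True and False compare equal to 1 and 0
    if any(e not in allowed for e in events):
        problems.append("event column has values other than 0/1 or True/False")
    return problems

# Hypothetical example columns.
print(check_survival_columns([5, 12, 7], [1, 0, True]))
print(check_survival_columns([5, -1], [1, 2]))
```

A check like this only confirms the shape of the data. It cannot confirm that the coding matches the scientific question.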

Licklider detects survival structure automatically from column names and value patterns. If the detection is incorrect, specify the correct column roles in the Chat before running the analysis.

Licklider cannot determine automatically whether the chosen time origin, event definition, or censoring mechanism is scientifically correct for your study. In particular, it cannot infer from the table alone whether censoring is informative, whether a 0/1 column is truly the event indicator of interest, whether delayed entry or competing risks should change the analysis, or whether the recorded time variable reflects the intended clinical endpoint.

These limits matter because a survival curve can look technically valid while still answering the wrong survival question. If the event coding or time origin is wrong, the median survival, number at risk, and log-rank comparison can all be misleading even when the software runs without error.

How to request it

Describe the analysis in the Chat. For example:

"Show a Kaplan-Meier curve"
"Compare survival between the two groups"
"Run a log-rank test"

What the results include

Survival curve

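A step function showing the estimated survival probability at each time point, plotted separately for each group. The curve steps down at each time an event occurs and stays flat between events. A censored observation does not produce a step, but it reduces the number at risk for later events.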

Median survival

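The time at which the estimated survival probability first falls to 0.5 or below; that is, the time by which half the subjects are estimated to have experienced the event. When the estimated survival curve never falls to 0.5 during follow-up, the median cannot be estimated and is reported as not reached. Because censored subjects leave the risk set, this is not the same as checking whether half of all subjects had the event: with heavy early censoring the curve can cross 0.5 even though fewer than half of the enrolled subjects experienced the event.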
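
The rule can be sketched in a few lines. This is an illustrative calculation on plain Python lists, not Licklider's code, and the name km_median is an assumption. It uses exact fractions so that the comparison with 0.5 is not affected by floating-point rounding.

```python
from fractions import Fraction

def km_median(times, events):
    """Smallest time at which the Kaplan-Meier estimate is <= 0.5, else None."""
    n_at_risk = len(times)
    survival = Fraction(1)
    for t in sorted(set(times)):
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        if d:
            survival *= Fraction(n_at_risk - d, n_at_risk)
            if survival <= Fraction(1, 2):
                return t
        n_at_risk -= times.count(t)   # events and censorings leave the risk set
    return None                        # median not reached

# Hypothetical toy data sets.
print(km_median([2, 3, 3, 5, 8, 9], [1, 1, 0, 1, 0, 1]))        # 5
print(km_median([2, 4, 6, 8, 10, 12], [1, 0, 0, 0, 0, 0]))      # None (not reached)
print(km_median([1, 2, 3, 4, 5, 6, 7], [0, 0, 0, 0, 1, 1, 0]))  # 6
```

In the third example only two of seven subjects had the event, yet the estimate falls to 1/3 at time 6 because four subjects were censored before any event occurred.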

Number at risk

The number of subjects still under observation at each time point. This is shown in the Inspector alongside the survival curve and is an important part of any survival analysis report.

The at-risk count is included because later parts of a survival curve are often based on fewer remaining participants. Showing those counts helps you judge how much information is still supporting the tail of the curve.
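
A number-at-risk row can be computed by counting, at each reporting time, the subjects whose follow-up time is at least that long. The sketch below assumes plain Python lists; the function name number_at_risk and the reporting grid are illustrative choices, not Licklider settings.

```python
def number_at_risk(times, grid):
    """Count subjects still under observation just before each grid time."""
    return [sum(1 for t in times if t >= g) for g in grid]

# Hypothetical follow-up times and reporting grid.
toy_times = [2, 3, 3, 5, 8, 9]
print(number_at_risk(toy_times, [0, 3, 6, 9]))  # [6, 5, 2, 1]
```

By time 9 the curve rests on a single subject, which is exactly the kind of sparse tail the at-risk row is meant to expose.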

Log-rank test

When two or more groups are present, the log-rank test evaluates the null hypothesis that all groups share the same survival function. The test statistic and p-value are reported.
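
For two groups, the test statistic can be sketched as follows. This is a textbook version of the calculation using only the standard library, shown to make the logic concrete; it is not Licklider's implementation, and the function name logrank_two_groups is an assumption. At each event time it compares the observed events in the first group with the number expected if both groups shared one survival function, then refers the result to a chi-square distribution with 1 degree of freedom.

```python
import math

def logrank_two_groups(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Returns (chi_square, p_value) on 1 df."""
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in pooled if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n = sum(1 for ti, _, _ in pooled if ti >= t)               # at risk overall
        n_a = sum(1 for ti, _, g in pooled if ti >= t and g == 0)  # at risk in group A
        d = sum(1 for ti, e, _ in pooled if ti == t and e)         # events at t
        d_a = sum(1 for ti, e, g in pooled if ti == t and e and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the group A event count at time t.
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("log-rank statistic is undefined for these data")
    chi_square = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi_square / 2))  # chi-square tail area, 1 df
    return chi_square, p_value

# Hypothetical toy data: all events in group A occur before any in group B.
chi_square, p_value = logrank_two_groups(
    [1, 2, 3, 4, 5], [1, 1, 1, 1, 1],
    [6, 7, 8, 9, 10], [1, 1, 1, 1, 1],
)
print(round(chi_square, 2), round(p_value, 4))
```

The test weights every event time equally, so it is most sensitive to differences that persist across follow-up rather than curves that cross.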

For three or more groups, pairwise log-rank comparisons are run automatically with your choice of multiple comparison correction (Bonferroni, Holm, or Benjamini-Hochberg).
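
The three corrections can be sketched as adjusted p-values. The code below uses the standard definitions of each method and the standard library only; the function name adjust_p_values and the method labels "bonferroni", "holm", and "bh" are illustrative, not Licklider option names.

```python
def adjust_p_values(p_values, method="holm"):
    """Return adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    if method == "bonferroni":
        return [min(1.0, p * m) for p in p_values]
    if method == "holm":
        running = 0.0
        for rank, i in enumerate(order):
            # Step-down: multiply by the number of hypotheses not yet rejected.
            running = max(running, (m - rank) * p_values[i])
            adjusted[i] = min(1.0, running)
        return adjusted
    if method == "bh":
        running = 1.0
        for rank in range(m - 1, -1, -1):
            i = order[rank]
            # Step-up: scale by m / rank and enforce monotonicity from the top.
            running = min(running, p_values[i] * m / (rank + 1))
            adjusted[i] = running
        return adjusted
    raise ValueError(f"unknown method: {method}")

# Hypothetical p-values from three pairwise log-rank comparisons.
pairwise = [0.01, 0.04, 0.03]
for name in ("bonferroni", "holm", "bh"):
    print(name, [round(p, 3) for p in adjust_p_values(pairwise, name)])
```

Bonferroni and Holm control the family-wise error rate, with Holm never less powerful than Bonferroni. Benjamini-Hochberg controls the false discovery rate instead, which is a weaker guarantee.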

Confidence intervals

Pointwise confidence intervals for the survival curve are not currently shown in the figure.

Because the figure does not currently show pointwise confidence bands, the supporting disclosure elements — especially censoring acknowledgement and number-at-risk reporting — are important for interpreting how much uncertainty and follow-up support remain across the curve.

Required disclosures

Survival analysis results require the following before claim-bearing export is allowed:

Censoring must be acknowledged: the proportion of censored observations must be reported
At-risk counts must be present: the number of subjects at risk at each time point must be shown or reported
Median survival must be estimable before it is claimed: if the estimated survival curve never falls to 0.5 during follow-up, the median must be reported as not reached rather than as a specific number
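
The censoring figure in the first requirement is a simple count. The sketch below shows one way to compute it from an event indicator column held as a Python list; the function name censoring_summary and the returned keys are hypothetical and do not describe how the guard itself is implemented.

```python
def censoring_summary(events):
    """Summarize how many observations ended in an event versus censoring."""
    n = len(events)
    n_events = sum(1 for e in events if e)
    n_censored = n - n_events
    return {
        "n": n,
        "events": n_events,
        "censored": n_censored,
        "censored_fraction": n_censored / n if n else float("nan"),
    }

# Hypothetical event indicator column: 1 = event, 0 = censored.
print(censoring_summary([1, 1, 0, 1, 0, 1]))
```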

These requirements are evaluated by the Survival Data Detection Guard — see Survival Data Detection Guard.

Design Rationale & References

This page follows a simple rule: survival results should show not only whether groups differ, but also whether the data structure supports a responsible interpretation of that difference. That is why Licklider requires a time variable and event indicator, reports median survival only when it is estimable, and treats censoring and at-risk counts as part of the minimum disclosure rather than optional extras.

Kaplan-Meier estimation is used because it is the standard nonparametric way to estimate survival over time when some observations are censored [1]. The log-rank test is used for group comparison because it provides a standard global test of whether survival curves differ across groups. It assumes that censoring is non-informative, and it has the most power when the hazards in the groups are proportional over time.

At-risk counts are emphasized because the visible curve alone can overstate how much information supports later time points. As the number at risk falls, the tail of the curve becomes harder to interpret, so reporting at-risk counts is an important guard against over-reading sparse follow-up segments [2].

Confidence intervals are not currently shown in the figure, so the design relies on disclosure and supporting metadata to keep uncertainty visible until that display is available.

[1] Kaplan, E. L., & Meier, P. (1958). Nonparametric Estimation from Incomplete Observations. Journal of the American Statistical Association, 53(282), 457-481.
[2] Morris, T. P., Jarvis, C. I., Cragg, W., et al. (2019). Proposals on Kaplan-Meier plots in medical research and a survey of stakeholder views: KMunicate. BMJ Open, 9, e030215. https://bmjopen.bmj.com/content/9/9/e030215

What this page does not cover

Modeling survival with covariates — see Cox Proportional Hazards Regression
How survival data is detected automatically — see Survival Data Detection Guard, Outcome Type and Analysis Intent
Kaplan-Meier curve visualization — see Kaplan-Meier Curve