Formulas for the Kaplan–Meier estimator
Let $t_1 < t_2 < \dots < t_J$ represent the distinct event times. For each event time $t_j$, let $n_j$ be the size of the risk set (the number of surviving observations) just prior to $t_j$.
Survival data is expected when outcome.type = "survival". Let $d_j$ be the number of failures at $t_j$.
The Kaplan–Meier estimator of the survival function $S(t)$ is as below.
Kaplan–Meier estimator
$$\hat{S}(t) = \prod_{j:\, t_j \le t} \left(1 - \frac{d_j}{n_j}\right)$$
Note that the estimator is defined to be right-continuous, so the events at $t$ are included in the estimate of $S(t)$.
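As a rough illustration of the product-limit formula, the following Python sketch computes the risk-set size, the number of failures, and the Kaplan–Meier estimate at each distinct event time. It is not the package's R implementation; the function name `kaplan_meier` and the toy data are hypothetical.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t_j, n_j, d_j, S(t_j))] at each distinct event time.

    times  : observed follow-up times
    events : 1 if a failure was observed, 0 if right-censored
    """
    failures = Counter(t for t, e in zip(times, events) if e == 1)
    surv, rows = 1.0, []
    for t in sorted(failures):
        n_j = sum(1 for u in times if u >= t)  # risk set just prior to t_j
        d_j = failures[t]                      # failures at t_j
        surv *= 1.0 - d_j / n_j                # events at t_j are included
        rows.append((t, n_j, d_j, surv))
    return rows

# Hypothetical toy data: six subjects, two of them censored.
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 1]
for t, n_j, d_j, s in kaplan_meier(times, events):
    print(f"t={t}  n={n_j}  d={d_j}  S={s:.4f}")
```

Subjects censored at an event time are counted in the risk set at that time, which is the usual convention.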
The variance or standard error of the Kaplan–Meier estimator is often calculated with the Greenwood formula. This formula is derived from a binomial argument, so extension to the weighted case is ad hoc. Alternatively, Tsiatis (1981) proposes a slightly different formula based on a counting process argument which includes the weighted case.
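Greenwood variance
$$\widehat{\mathrm{Var}}\{\hat{S}(t)\} = \hat{S}(t)^2 \sum_{j:\, t_j \le t} \frac{d_j}{n_j (n_j - d_j)}$$
Tsiatis variance
$$\widehat{\mathrm{Var}}\{\hat{S}(t)\} = \hat{S}(t)^2 \sum_{j:\, t_j \le t} \frac{d_j}{n_j^2}$$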
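To make the difference between the two variance formulas concrete, here is a minimal Python sketch that accumulates both sums from the risk-set sizes and failure counts. It assumes the standard forms, with d_j / (n_j (n_j - d_j)) for Greenwood and d_j / n_j^2 for Tsiatis, each multiplied by the squared survival estimate. The function name `km_variances` and the counts are hypothetical.

```python
def km_variances(n_at_risk, n_events):
    """Return (S, Greenwood var, Tsiatis var) at each ordered event time.

    Assumes n_j > d_j at every event time, so the Greenwood term is finite.
    """
    surv, greenwood_sum, tsiatis_sum, out = 1.0, 0.0, 0.0, []
    for n_j, d_j in zip(n_at_risk, n_events):
        surv *= 1.0 - d_j / n_j
        greenwood_sum += d_j / (n_j * (n_j - d_j))
        tsiatis_sum += d_j / n_j ** 2
        out.append((surv, surv ** 2 * greenwood_sum, surv ** 2 * tsiatis_sum))
    return out

# Hypothetical counts at three event times.
for s, v_g, v_t in km_variances([6, 5, 3], [1, 1, 1]):
    print(f"S={s:.4f}  Greenwood SE={v_g ** 0.5:.4f}  Tsiatis SE={v_t ** 0.5:.4f}")
```

Because d_j / n_j^2 is smaller than d_j / (n_j (n_j - d_j)) whenever d_j > 0, the Tsiatis variance in this sketch never exceeds the Greenwood variance.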
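Suppose that the survival data consist of $(T_i, \delta_i)$, an independent sample of right-censored survival data with weights $w_i$ ($i = 1, \dots, n$). Let $d_j^w = \sum_{i} w_i I(T_i = t_j, \delta_i = 1)$ and $n_j^w = \sum_{i} w_i I(T_i \ge t_j)$ be the weighted number of failures and the weighted number at risk, respectively, at time $t_j$. The weighted Kaplan–Meier estimator of the survival function is
$$\hat{S}^w(t) = \prod_{j:\, t_j \le t} \left(1 - \frac{d_j^w}{n_j^w}\right)$$
Xie and Liu (2005) proposed the Greenwood-type variance for the weighted Kaplan–Meier estimator.
Greenwood variance for weighted Kaplan–Meier
$$\widehat{\mathrm{Var}}\{\hat{S}^w(t)\} = \hat{S}^w(t)^2 \sum_{j:\, t_j \le t} \frac{d_j^w}{M_j (n_j^w - d_j^w)}$$
where $M_j$ is an adjustment factor defined as
$$M_j = \frac{\left(\sum_{i:\, T_i \ge t_j} w_i\right)^2}{\sum_{i:\, T_i \ge t_j} w_i^2}$$
The Tsiatis-type variance is calculated as follows in the same spirit.
Tsiatis variance for weighted Kaplan–Meier
$$\widehat{\mathrm{Var}}\{\hat{S}^w(t)\} = \hat{S}^w(t)^2 \sum_{j:\, t_j \le t} \frac{d_j^w}{M_j n_j^w}$$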
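The weighted estimator and its two variances can be sketched in Python as follows. This is an illustration under stated assumptions, not the package's code: the adjustment factor is taken to be the effective sample size of the risk set, (sum of weights)^2 / (sum of squared weights), and the Tsiatis-type sum is taken to be the direct weighted analogue of d_j / n_j^2. The function name `weighted_km` and the data are hypothetical.

```python
def weighted_km(times, events, weights):
    """Return [(t_j, S_w, Greenwood-type var, Tsiatis-type var)].

    Assumes the weighted number at risk exceeds the weighted number of
    failures at every event time.
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    surv, g_sum, t_sum, rows = 1.0, 0.0, 0.0, []
    for t in event_times:
        at_risk = [w for u, w in zip(times, weights) if u >= t]
        n_w = sum(at_risk)                                # weighted number at risk
        d_w = sum(w for u, e, w in zip(times, events, weights)
                  if u == t and e == 1)                   # weighted failures
        m_j = n_w ** 2 / sum(w * w for w in at_risk)      # assumed adjustment factor
        surv *= 1.0 - d_w / n_w
        g_sum += d_w / (m_j * (n_w - d_w))
        t_sum += d_w / (m_j * n_w)
        rows.append((t, surv, surv ** 2 * g_sum, surv ** 2 * t_sum))
    return rows

# Hypothetical data; with unit weights the unweighted formulas are recovered.
times = [1, 2, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0, 0]
for row in weighted_km(times, events, [1.0] * 6):
    print(row)
```

With all weights equal, the assumed adjustment factor reduces to the risk-set size, so the output matches the unweighted Greenwood and Tsiatis variances. Rescaling every weight by the same constant leaves the results unchanged.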
Formulas for the Aalen-Johansen estimator
Competing risks (outcome.type = "competing-risk") arise
in studies in which individuals are exposed to two or more mutually
exclusive failure events. When a failure occurs, we observe the time to
event $T$ and the cause of failure $\epsilon$.
Suppose that $\epsilon = 1$ and $\epsilon = 2$ represent the event of interest and the competing risk, respectively.
Let $d_{kj}$ be the number of failures of cause $k$ at time $t_j$,
and now the total number of failures at $t_j$ is $d_j = d_{1j} + d_{2j}$.
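The Aalen-Johansen estimator of the cumulative incidence function (CIF) for cause $k$ is as below.
Aalen-Johansen estimator
$$\hat{F}_k(t) = \sum_{j:\, t_j \le t} \hat{S}(t_{j-1}) \frac{d_{kj}}{n_j}$$
where $\hat{S}(t)$ is the overall survival function (the Kaplan–Meier estimator counting failures from any cause), with $\hat{S}(t_0) = 1$.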
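The following Python sketch shows one way to accumulate the Aalen-Johansen estimate: at each event time, the cause-specific increment is the overall survival just before that time multiplied by the cause-specific failure fraction. It is an illustration only; the function name `aalen_johansen`, the cause coding (0 = censored, 1 = event of interest, 2 = competing risk), and the data are assumptions.

```python
def aalen_johansen(times, causes, cause=1):
    """Return [(t_j, CIF for `cause`, overall survival)] at each event time.

    causes: 0 = censored, 1 = event of interest, 2 = competing risk.
    """
    event_times = sorted({t for t, c in zip(times, causes) if c != 0})
    surv, cif, rows = 1.0, 0.0, []
    for t in event_times:
        n_j = sum(1 for u in times if u >= t)                        # at risk
        d_j = sum(1 for u, c in zip(times, causes) if u == t and c != 0)
        d_kj = sum(1 for u, c in zip(times, causes) if u == t and c == cause)
        cif += surv * d_kj / n_j    # uses overall survival just before t_j
        surv *= 1.0 - d_j / n_j     # then update overall survival
        rows.append((t, cif, surv))
    return rows

# Hypothetical toy data with one censored subject.
times = [1, 2, 3, 4, 5, 6]
causes = [1, 2, 0, 1, 2, 1]
for t, f1, s in aalen_johansen(times, causes, cause=1):
    print(f"t={t}  F1={f1:.4f}  S={s:.4f}")
```

A useful sanity check is that the CIFs of all causes and the overall survival sum to one at every event time.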
Two variance estimators of the Aalen-Johansen estimator are commonly used: one based on counting process theory (Aalen, 1978) and the other based on the delta method.
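Aalen variance
$$\widehat{\mathrm{Var}}\{\hat{F}_k(t)\} = \sum_{j:\, t_j \le t} \{\hat{F}_k(t) - \hat{F}_k(t_j)\}^2 \frac{d_j}{(n_j - 1)(n_j - d_j)} + \sum_{j:\, t_j \le t} \hat{S}(t_{j-1})^2 \frac{d_{kj}(n_j - d_{kj})}{n_j^2 (n_j - 1)} - 2 \sum_{j:\, t_j \le t} \{\hat{F}_k(t) - \hat{F}_k(t_j)\} \hat{S}(t_{j-1}) \frac{d_{kj}(n_j - d_{kj})}{n_j (n_j - d_j)(n_j - 1)}$$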
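Delta method variance
$$\widehat{\mathrm{Var}}\{\hat{F}_k(t)\} = \sum_{j:\, t_j \le t} \{\hat{F}_k(t) - \hat{F}_k(t_j)\}^2 \frac{d_j}{n_j (n_j - d_j)} + \sum_{j:\, t_j \le t} \hat{S}(t_{j-1})^2 \frac{d_{kj}(n_j - d_{kj})}{n_j^3} - 2 \sum_{j:\, t_j \le t} \{\hat{F}_k(t) - \hat{F}_k(t_j)\} \hat{S}(t_{j-1}) \frac{d_{kj}}{n_j^2}$$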
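As a sketch of how the delta-method variance can be evaluated, the Python function below assumes the commonly cited three-term form: a Greenwood-like term weighted by the squared remaining CIF increment, a binomial-like term for the cause-specific failures, and a cross term. Whether this matches the package's internal computation exactly is an assumption; the function name `cif_delta_variance` and the inputs are hypothetical.

```python
def cif_delta_variance(n, d, d1):
    """Return [(F1(t_m), delta-method variance)] at each ordered event time.

    n[j]  : number at risk just prior to t_j
    d[j]  : failures from any cause at t_j
    d1[j] : failures from the cause of interest at t_j
    """
    s_prev, f1, surv, cif = [], [], 1.0, 0.0
    for n_j, d_j, d1_j in zip(n, d, d1):
        s_prev.append(surv)            # overall survival just before t_j
        cif += surv * d1_j / n_j
        surv *= 1.0 - d_j / n_j
        f1.append(cif)
    out = []
    for m in range(len(n)):
        var = 0.0
        for j in range(m + 1):
            diff = f1[m] - f1[j]       # CIF accumulated after t_j
            if n[j] > d[j]:            # diff is 0 when nobody survives t_j
                var += diff ** 2 * d[j] / (n[j] * (n[j] - d[j]))
            var += s_prev[j] ** 2 * d1[j] * (n[j] - d1[j]) / n[j] ** 3
            var -= 2.0 * diff * s_prev[j] * d1[j] / n[j] ** 2
        out.append((f1[m], var))
    return out

# Hypothetical counts: four at risk, one cause-1 failure at each of two times.
for f, v in cif_delta_variance([4, 3], [1, 1], [1, 1]):
    print(f"F1={f:.4f}  SE={v ** 0.5:.4f}")
```

In this example nobody is censored before the second event time, and the assumed formula reproduces the binomial variance F(1 - F)/n of the empirical proportion at both times.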
Variance based on influence functions
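Under regularity conditions, the Aalen-Johansen estimator can be expanded
as
$$\sqrt{n}\,\{\hat{F}_k(t) - F_k(t)\} = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \mathrm{IF}_{ik}(t) + o_p(1)$$
and the process $\sqrt{n}\,\{\hat{F}_k(t) - F_k(t)\}$
converges weakly to a tight Gaussian process. Here $\mathrm{IF}_{ik}(t)$
is the influence function, the contribution of the $i$-th
observation to the Aalen-Johansen estimator, and may be written as
$$\mathrm{IF}_{ik}(t) = \int_0^t \frac{S(u-)}{Y(u)/n}\, dM_{ik}(u) - \int_0^t \frac{F_k(t) - F_k(u)}{Y(u)/n}\, dM_i(u)$$
where $M_i(u)$ and $M_{ik}(u)$ are the martingale processes of the total count and the count of cause $k$ of the $i$-th observation, respectively, and $Y(u)$ is the at-risk process. A consistent variance estimator for $\hat{F}_k(t)$ is $n^{-2} \sum_{i=1}^{n} \widehat{\mathrm{IF}}_{ik}(t)^2$, where $\widehat{\mathrm{IF}}_{ik}(t)$ is the plug-in estimate of $\mathrm{IF}_{ik}(t)$.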
Confidence interval options
Standard errors. The default in
cifcurve() with weights=NULL is the Greenwood
SE when outcome.type="survival" and the delta SE when
outcome.type="competing-risk". The default in
cifcurve() with weights is the SE based on influence
functions. By default cifcurve() rescales the
Greenwood/Tsiatis quantities so that std.err is reported on
the probability scale; set report.survfit.std.err = TRUE to
return the conventional log-survival SEs from
survival::survfit().
Confidence intervals. cifcurve()
constructs intervals on the probability scale using the requested
transformation:
"arcsine-square root"/"arcsin"/"a"
(default), "plain", "log",
"log-log", or "logit". Passing
"none"/"n" skips interval computation
entirely. The function back-transforms to the probability scale,
clips bounds to [0, 1], and replaces undefined values with
NA so that interval endpoints remain well behaved in plots
and summaries.
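The transformations listed above can be illustrated with the textbook delta-method intervals. The Python sketch below is not the cifcurve() implementation: the function name `conf_int`, the argument name `conf_type`, and the choice to return NaN at the boundaries are assumptions made for illustration. It takes an estimate p and its standard error on the probability scale, builds the interval on the transformed scale, back-transforms, and clips to [0, 1].

```python
import math

Z95 = 1.959963984540054  # 97.5% standard normal quantile (two-sided 95%)

def conf_int(p, se, conf_type="arcsin", z=Z95):
    """Return (lower, upper) for estimate p with probability-scale SE `se`."""
    if conf_type == "none":
        return (None, None)                      # interval computation skipped
    if conf_type == "plain":
        lo, hi = p - z * se, p + z * se
    elif not 0.0 < p < 1.0:
        return (math.nan, math.nan)              # transform undefined at 0 or 1
    elif conf_type == "log":
        lo, hi = p * math.exp(-z * se / p), p * math.exp(z * se / p)
    elif conf_type == "log-log":
        w = z * se / (p * math.log(p))           # negative, since log(p) < 0
        lo, hi = p ** math.exp(-w), p ** math.exp(w)
    elif conf_type == "logit":
        eta, h = math.log(p / (1.0 - p)), z * se / (p * (1.0 - p))
        lo = 1.0 / (1.0 + math.exp(-(eta - h)))
        hi = 1.0 / (1.0 + math.exp(-(eta + h)))
    elif conf_type == "arcsin":
        a, h = math.asin(math.sqrt(p)), z * se / (2.0 * math.sqrt(p * (1.0 - p)))
        lo = math.sin(max(0.0, a - h)) ** 2
        hi = math.sin(min(math.pi / 2.0, a + h)) ** 2
    else:
        raise ValueError(f"unknown conf_type: {conf_type!r}")
    return (min(1.0, max(0.0, lo)), min(1.0, max(0.0, hi)))  # clip to [0, 1]

for kind in ("plain", "log", "log-log", "logit", "arcsin"):
    print(kind, conf_int(0.5, 0.1, kind))
```

Only the plain interval can leave [0, 1] before clipping on the lower side; the log interval can exceed 1 on the upper side, while the log-log, logit, and arcsine intervals stay inside the unit interval by construction.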