Basic Statistics

COMPUTING ESTIMATES OF f(t), S(t), AND h(t)

If the event and censoring occur at the same time, as happened on day 3, Kaplan
and Meier recommend treating the event as if it occurred slightly before the censoring,
and the censoring as if it occurred slightly after the event. An
alternative method of handling censored data is given by Hosmer et al. [2008].
The computations are summarized in Table 14.3. The column labeled days signifies
the day the event occurred, the column labeled death gives the number of patients
who died on a particular day, censored signifies lost to follow-up or withdrawn alive,
n_obs is the number of patients observed on that day, (n_obs - d)/n_obs is a computed
quantity in which d is the number of deaths on that day, and S(t) is the estimated
survival function. In our example, S(t) = 1 up to day 2, when the first patient died,
since all eight patients survived to day 2.
On day 2, one patient died, so the chance of surviving past 2 days is 7/8, or .875.
(n_obs - d)/n_obs gives the proportion surviving at each particular time that a patient
dies, given that the person has survived up to then. The chance of surviving up to day
2 is 1, and the chance of surviving past day 2 is .875 given that the person has survived
up to day 2, so the chance of surviving past day 2 is S(2) = .875(1) = .875. Here, 2 has
been substituted for t in S(t) since we are giving the results for day 2.
On day 3, one patient dies and one is censored. Note that we assume that the death
occurs first. Thus, the death occurs among the seven remaining patients, so the chance
of surviving day 3 given that the patient survived up to day 3 is (7 - 1)/7 = .857.
The chance of surviving past day 3 is therefore .857 times the chance of surviving up
to day 3, or .857(.875) = .750. There is no change
in S(t) when a patient is censored, so no calculations are required in the third row
except to reduce the number of patients observed by one to account for the patient
who is censored. Thus at the start of day 4, we have only five patients.
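
The bookkeeping for tied times can be sketched in a few lines of Python. This is only an illustration: the function name at_risk_counts and the record layout (day, kind) are assumptions made for the example, and only the three records described so far (a death on day 2, and a death and a censoring on day 3) are entered.

```python
def at_risk_counts(records, n_start):
    """List each record with the number of patients observed just before it.

    records: (day, kind) pairs, kind being "death" or "censored".
    Within the same day, deaths are placed before censorings,
    following the Kaplan and Meier recommendation for ties.
    """
    # Sort by day; the second key is False for deaths, so they sort first.
    ordered = sorted(records, key=lambda r: (r[0], r[1] == "censored"))
    n_obs = n_start
    rows = []
    for day, kind in ordered:
        rows.append((day, kind, n_obs))
        n_obs -= 1  # a death or a censoring each removes one patient
    return rows, n_obs


# Records entered out of order on purpose to show the tie handling.
records = [(3, "censored"), (2, "death"), (3, "death")]
rows, remaining = at_risk_counts(records, n_start=8)
for row in rows:
    print(row)
print("patients observed at the start of day 4:", remaining)
```

The death on day 3 is counted against seven patients and the censoring against six, which leaves the five patients observed at the start of day 4.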
On day 4, two patients die, so the chance of surviving day 4 given the patient is
known to be alive up to day 4 is (5 - 2)/5 = .600. The chance of surviving past day
4 is S(4) = .600(.750) = .450, where .750 is the chance of surviving past day 3.
The remaining rows are computed in a similar fashion. The general formula for
any given row can be given as (n_obs - d)/n_obs times the numerical value of S(t) for
the preceding row.
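
As a rough check on the arithmetic, the row-by-row rule can be written as a short Python loop. The function name product_limit and the (day, n_obs, d) row layout are assumptions made for this sketch. Only the three rows worked out in the text are entered, since later rows would need the censoring times from Table 14.3.

```python
def product_limit(rows):
    """Apply S(t) = ((n_obs - d) / n_obs) * S(previous row) down the table.

    rows: (day, n_obs, d) triples for days on which at least one death occurs,
    where n_obs is the number observed and d the number of deaths that day.
    Returns (day, conditional_survival, S) triples.
    """
    s = 1.0  # S(t) = 1 before the first death
    out = []
    for day, n_obs, d in rows:
        conditional = (n_obs - d) / n_obs
        s *= conditional
        out.append((day, conditional, s))
    return out


# Rows for days 2, 3, and 4 as described in the text.
rows = [(2, 8, 1), (3, 7, 1), (4, 5, 2)]
for day, conditional, s in product_limit(rows):
    print(f"day {day}: (n_obs - d)/n_obs = {conditional:.3f}, S(t) = {s:.3f}")
```

Rounded to three decimals, the loop gives S(2) = .875, S(3) = .750, and S(4) = .450, in agreement with the hand calculation.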
If we were to plot this data as a step function, we would first make a vertical axis
that goes from 0 to 1 and a horizontal axis that goes from 0 to 10. The values of
S(t) are plotted on the vertical axis and time in days is plotted on the horizontal axis.
Between day 0 and day 2, we would plot a horizontal line that had a height of 1. From
day 2 to day 3, another horizontal line would be plotted with a height of .875, from
day 3 to day 4, another horizontal line would have a height of .750, and from day 4 to
day 7, a horizontal line with a height of .450 is plotted. From day 7 to day 9 the height
of the horizontal line would be .225, and after day 9 the height would be 0. Figure
14.7 illustrates the results obtained from Table 14.3.
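
The step function just described can also be expressed as a small lookup, which may help when reading values off the plot. The sketch below uses only the heights quoted in the text. The names STEP_DAYS, STEP_HEIGHTS, survival_at, and step_segments are assumptions for the example, and the curve is taken to drop exactly on the day of a death, so that S(2) = .875.

```python
from bisect import bisect_right

# Days at which a new horizontal line starts, and its height (from the text).
STEP_DAYS = [0, 2, 3, 4, 7, 9]
STEP_HEIGHTS = [1.0, 0.875, 0.750, 0.450, 0.225, 0.0]


def survival_at(t):
    """Height of the step function at time t (in days), for t >= 0."""
    if t < 0:
        raise ValueError("t must be non-negative")
    # Find the last step that starts at or before t.
    return STEP_HEIGHTS[bisect_right(STEP_DAYS, t) - 1]


def step_segments(t_max=10):
    """Horizontal segments (start_day, end_day, height) for a plot ending at t_max."""
    ends = STEP_DAYS[1:] + [t_max]
    return list(zip(STEP_DAYS, ends, STEP_HEIGHTS))


for start, end, height in step_segments():
    print(f"day {start} to day {end}: height {height:.3f}")
```

Each segment is one horizontal line of the plot. A plotting routine would draw these segments and join them with vertical drops at days 2, 3, 4, 7, and 9.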
For actual data sets with more times to the event, statistical programs are
recommended both for the computations and for graphing the survival function, as the
Kaplan-Meier product-limit method requires considerable work to calculate and graph
by hand. Note that SAS, SPSS, and Stata will provide plots of the results. These
programs will also print results from two or more groups on the same graph, so visual
comparisons can be made between groups.
