Basic Statistics

COMPUTING ESTIMATES OF f(t), S(t), AND h(t)

using the original sample size n. Here, the following computations use the number
actually exposed to risk, not simply the number entering the study.
The next column is labeled $\hat{q}$. The hat over the q is there to indicate that $\hat{q}$ is an
approximation. It estimates the proportion of patients who die in an interval given
that they are exposed to risk in that interval. It is computed from


$$\hat{q} = d / n_{\text{exp}}$$


For example, for the first interval we compute 8/39.5 = .203. For the second interval,
we have 15/29.5 = .508.
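The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function name `death_proportion` and the list `intervals` are hypothetical, and the data are limited to the two intervals whose deaths and numbers exposed to risk are quoted in the text.

```python
def death_proportion(deaths, n_exposed):
    """Estimated proportion dying in an interval, given exposure to risk."""
    if n_exposed <= 0:
        raise ValueError("number exposed to risk must be positive")
    return deaths / n_exposed


# (deaths d, number exposed to risk n_exp) for the first two intervals
intervals = [(8, 39.5), (15, 29.5)]

q_hat = [round(death_proportion(d, n), 3) for d, n in intervals]
print(q_hat)  # [0.203, 0.508]
```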
The column labeled $\hat{p}$ is computed simply as
$$\hat{p} = 1 - \hat{q}$$


and is the proportion of patients who survive an interval given that they are exposed
to risk in the interval. For the first interval, $\hat{p} = 1 - .203 = .797$.
The survival function gives the proportion surviving up to the start of the interval.
In some texts, it is denoted by P(t) rather than the S(t) used in this book. The sample
survival function, $\hat{S}(t)$, for the first interval is equal to 1 since all patients survive up
to the beginning of the first interval. For the remaining intervals, it is computed by
multiplying $\hat{p}$ by $\hat{S}(t)$, both from the preceding interval. For example, for $\hat{S}(t)$ for
the second interval we multiply .797 by 1.000 to obtain $\hat{S}(t) = .797$ for the second
interval. For the third interval, we multiply .492 by .797 to obtain .392. For the fourth
interval, we compute .360(.392) = .141, and for the last interval, .714(.141) = .101.
In other words, the chance of surviving to the start of a particular interval is equal
to the chance of surviving up to the start of the preceding interval times the chance
of surviving through the preceding interval. In graphing S(t) on the vertical axis, the
computed values are graphed above the start of the interval on the horizontal axis.
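The recursion described here, where each survival value is the preceding survival value times the preceding interval's proportion surviving, can be sketched as a running product. The variable names are illustrative assumptions; the proportions are the four values quoted in the text for the first through fourth intervals.

```python
# Proportion surviving each of the first four intervals, as quoted in the text
p_hat = [0.797, 0.492, 0.360, 0.714]

# Survival to the START of each interval; everyone survives to the start of the first
survival = [1.0]
for p in p_hat:
    # chance of reaching the next interval =
    # chance of reaching this one * chance of surviving through it
    survival.append(survival[-1] * p)

print([round(s, 3) for s in survival])  # [1.0, 0.797, 0.392, 0.141, 0.101]
```

The text rounds to three decimals at every step, while this sketch rounds only at the end; for these data the two approaches agree to three decimals.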
The sample death density, $\hat{f}(t)$, is estimated at the midpoint of each interval. For
each interval, we compute

$$\hat{f}(t) = \frac{\hat{S}(t)\,\hat{q}}{w}$$

where w is the width of the interval. In this example, w = .5. For example, for
the first interval, we compute 1(.203)/.5 = .406. For the second interval, we have
.797(.508)/.5 = .810. The other intervals proceed in a similar fashion.
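As a check on the two worked values, the density computation can be written as a small function. This is a sketch under the assumption, consistent with the worked examples, that the density is the survival value at the start of the interval times the proportion dying, divided by the interval width; the name `death_density` is hypothetical.

```python
def death_density(survival_at_start, q_hat, width):
    """Estimated death density at the midpoint of an interval."""
    if width <= 0:
        raise ValueError("interval width must be positive")
    return survival_at_start * q_hat / width


w = 0.5  # interval width used in the example

f_first = death_density(1.000, 0.203, w)
f_second = death_density(0.797, 0.508, w)

print(round(f_first, 3))   # 0.406
print(round(f_second, 3))  # 0.81
```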
Finally, the estimate of the sample hazard function, $\hat{h}(t)$, is also plotted at the midpoint
of each interval. The formula for this estimate is

$$\hat{h}(t) = \frac{d}{w\,[\,n_{\text{exp}} - d/2\,]}$$
Since the hazard is computed at the midpoint of the interval, we subtract one-half
of the deaths from $n_{\text{exp}}$. Here, we are assuming that the deaths occur in a uniform
fashion throughout the interval, so at the midpoint of the interval one-half of them
will have occurred. We multiply the denominator by the width of the interval to get
the proper rate. For example, for the first interval we have

$$\hat{h}(t) = \frac{8}{.5[39.5 - 8/2]} = \frac{8}{.5[35.5]} = .451$$
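The same calculation can be sketched in Python. The function name `hazard_rate` is an illustrative assumption; the inputs are the first-interval values from the text (8 deaths, 39.5 exposed to risk, width .5).

```python
def hazard_rate(deaths, n_exposed, width):
    """Estimated hazard at the midpoint of an interval.

    Half of the deaths are subtracted from the number exposed to risk,
    assuming deaths occur uniformly throughout the interval.
    """
    at_risk_at_midpoint = n_exposed - deaths / 2
    if width <= 0 or at_risk_at_midpoint <= 0:
        raise ValueError("width and number at risk at midpoint must be positive")
    return deaths / (width * at_risk_at_midpoint)


h_first = hazard_rate(deaths=8, n_exposed=39.5, width=0.5)
print(round(h_first, 3))  # 0.451
```

Dividing by the width converts the per-interval proportion into a rate per unit of time, which is why the result can differ noticeably from the proportion dying in the interval.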
