Introductory Biostatistics

Step 3: Variable "acid" is entered. Analysis of variables in the model is
shown in Table 9.12. None of the variables are removed. Analysis of
variables not in the model is shown in Table 9.13. No (additional) variables
meet the 0.1 level for entry into the model.

Note: A SAS program would include these instructions:

PROC LOGISTIC DESCENDING DATA = CANCER;
MODEL NODES = XRAY GRADE STAGE AGE ACID
/ SELECTION = STEPWISE SLE = .10 SLS = .15 DETAILS;
RUN;

where CANCER is the name assigned to the data set, NODES is the variable
name for nodal involvement, and XRAY, GRADE, STAGE, AGE, and ACID are the
variable names assigned to the five covariates. The covariates in the MODEL
statement are separated by spaces, and the x-ray covariate is named XRAY
because SAS variable names cannot contain a hyphen. The option DETAILS
provides step-by-step detailed results; without specifying it, we would have
only the final fitted model (which is just fine in practical applications).
The default values for the SLE (entry) and SLS (stay) probabilities in
PROC LOGISTIC are both 0.05.


9.2.5 Receiver Operating Characteristic Curve


Screening tests, as presented in Chapter 1, were focused on a binary test
outcome. However, the result of the test, although dichotomous, is often
based on the dichotomization of a continuous variable, say X, herein
referred to as the separator variable. Let us assume without loss of
generality that smaller values of X are associated with the diseased
population, often called the population of the cases. Conversely, larger
values of the separator are assumed to be associated with the control or
nondiseased population.
A test result is classified by choosing a cutoff X = x against which the
observation of the separator is compared. A test result is positive if the
value of the separator does not exceed the cutoff; otherwise, the result is
classified as negative. Most diagnostic tests are imperfect instruments, in
the sense that healthy persons will occasionally be classified wrongly as
being ill, while some people who are really ill may fail to be detected as
such. Therefore, there is the ensuing conditional probability of the correct
classification of a randomly selected case, or the sensitivity of the test as
defined in Section 1.1.2, which is estimated by the proportion of cases with
X ≤ x. Similarly, there is the conditional probability of the correct
classification of a randomly selected control, or the specificity of the test
as defined in Section 1.1.2, which can be estimated by the proportion of
controls with X > x. A receiver operating characteristic (ROC) curve, the
trace of the sensitivity versus (1 − specificity) of the test, is generated
as the cutoff x moves through its range of possible values. The ROC curve
goes from the bottom left corner (0, 0) to the top right corner (1, 1), as
shown in Figure 9.2. An estimated ROC curve can then be put to a number of
uses.
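
The construction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a procedure taken from the text: the function name roc_points and the small data set are hypothetical, and the candidate cutoffs are simply taken to be the distinct observed values of the separator. A result is called positive when the separator does not exceed the cutoff, so sensitivity is the proportion of cases with X ≤ x and specificity is the proportion of controls with X > x.

```python
def roc_points(cases, controls):
    """Empirical ROC points as (1 - specificity, sensitivity) pairs.

    Assumes smaller separator values indicate disease, so a test
    result is positive when the observed value is <= the cutoff.
    """
    # Hypothetical choice: use every distinct observed value as a cutoff.
    cutoffs = sorted(set(cases) | set(controls))

    # A cutoff below all observations classifies nobody as positive,
    # which gives the bottom left corner (0, 0).
    points = [(0.0, 0.0)]
    for x in cutoffs:
        # Proportion of cases correctly classified as positive (X <= x).
        sensitivity = sum(v <= x for v in cases) / len(cases)
        # Proportion of controls correctly classified as negative (X > x).
        specificity = sum(v > x for v in controls) / len(controls)
        points.append((1.0 - specificity, sensitivity))
    return points


# Made-up separator values, for illustration only.
cases = [1, 2, 3, 4]
controls = [3, 5, 6, 7]
for one_minus_spec, sens in roc_points(cases, controls):
    print(one_minus_spec, sens)
```

With the largest observed value as the cutoff, every subject tests positive, so the last point is the top right corner (1, 1). Plotting the pairs in order traces the empirical curve from (0, 0) to (1, 1).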

