Logistic Regression: A Self-learning Text, Third Edition (Statistics in the Health Sciences)


words, if a patient has a predicted probability greater than 0.100, then the patient
tests positive on the screening test. Notice that if a 0.100 cut point is used (see third
row), then of the 45 patients who actually had a knee fracture, 36 are correctly
classified as events and 9 are incorrectly classified as nonevents, yielding a sensitivity
of 36/45 = 0.8, or 80%.
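The sensitivity quoted above is the number of correctly classified events divided by the number of patients who actually had a fracture. A minimal sketch of that arithmetic in a DATA step follows; the variable names TRUE_POS, FALSE_NEG, and SENSITIVITY are illustrative and do not appear in the original program.

```sas
/* Sketch: check the sensitivity at the 0.100 cut point.            */
/* TRUE_POS, FALSE_NEG, and SENSITIVITY are illustrative names.     */
DATA _NULL_;
   TRUE_POS  = 36;   /* fractures correctly classified as events    */
   FALSE_NEG = 9;    /* fractures incorrectly classified as nonevents */
   SENSITIVITY = TRUE_POS / (TRUE_POS + FALSE_NEG);
   PUT SENSITIVITY=;  /* writes SENSITIVITY=0.8 to the SAS log      */
RUN;
```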


To produce an ROC plot, first an output dataset must be created using the OUTROC=
option in the MODEL statement of PROC LOGISTIC. This output dataset contains a
variable representing all the predicted probabilities as well as variables representing
the corresponding sensitivity and 1 − specificity. The code to create this output
dataset follows:


PROC LOGISTIC DATA=REF.KNEEFR DESCENDING;
   MODEL FRACTURE=FLEX WEIGHT AGECAT HEAD PATELLAR / OUTROC=CAT;
RUN;

The new dataset is called CAT (an arbitrary choice for the user). Using PROC PRINT,
the first ten observations from this dataset are printed as follows:


PROC PRINT DATA=CAT (OBS=10); RUN;

Obs   _PROB_  _POS_  _NEG_  _FALPOS_  _FALNEG_  _SENSIT_  _1MSPEC_

  1  0.49218      2    303         0        43   0.04444   0.00000
  2  0.43794      3    301         2        42   0.06667   0.00660
  3  0.35727      6    298         5        39   0.13333   0.01650
  4  0.34116      6    297         6        39   0.13333   0.01980
  5  0.31491      8    296         7        37   0.17778   0.02310
  6  0.30885     13    281        22        32   0.28889   0.07261
  7  0.29393     16    271        32        29   0.35556   0.10561
  8  0.24694     16    266        37        29   0.35556   0.12211
  9  0.23400     16    264        39        29   0.35556   0.12871
 10  0.22898     22    246        57        23   0.48889   0.18812
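The last two columns of this output can be recovered from the four count columns: sensitivity is the number of correctly classified events divided by all events, and 1 − specificity is the number of false positives divided by all nonevents. The following is a minimal sketch of that check, assuming the OUTROC= dataset CAT created above with SAS's standard underscore-delimited variable names; the dataset name CAT_CHECK and the variables SENS_CHECK and FPR_CHECK are hypothetical names introduced here.

```sas
/* Sketch: recompute sensitivity and 1 - specificity from the counts. */
/* CAT_CHECK, SENS_CHECK, and FPR_CHECK are illustrative names.       */
DATA CAT_CHECK;
   SET CAT;
   SENS_CHECK = _POS_ / (_POS_ + _FALNEG_);      /* should equal _SENSIT_ */
   FPR_CHECK  = _FALPOS_ / (_FALPOS_ + _NEG_);   /* should equal _1MSPEC_ */
RUN;

PROC PRINT DATA=CAT_CHECK (OBS=10);
   VAR _PROB_ _SENSIT_ SENS_CHECK _1MSPEC_ FPR_CHECK;
RUN;
```

For the first observation, for example, 2/(2 + 43) = 0.04444 and 0/(0 + 303) = 0.00000, which agrees with the printed values.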


The variable _PROB_ contains the predicted probabilities. The variables we wish to
plot are the last two, representing the sensitivity and 1 − specificity (called _SENSIT_
and _1MSPEC_). PROC GPLOT can be used to produce a scatter plot in SAS, as
shown below. The statement PLOT Y*X will plot the variable Y on the vertical axis
and X on the horizontal axis. The SYMBOL statement is used before PROC GPLOT to
set the plotting symbols as plus signs (VALUE=PLUS) and to plot a cubic regression
to smooth the shape of the plot (INTERPOL=RC). The code and plot follow:


SYMBOL VALUE=PLUS INTERPOL=RC;

PROC GPLOT DATA=CAT;
   PLOT _SENSIT_*_1MSPEC_;
RUN;
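Because the cubic regression smooths the curve, the plotted line need not pass through every observed point. As a hedged variant, the sketch below joins the observed points directly and labels the axes. The choice of INTERPOL=JOIN and the AXIS label text are assumptions made here, not part of the original program; the sketch also assumes the dataset CAT is still ordered as PROC LOGISTIC wrote it, so that the horizontal-axis values are nondecreasing.

```sas
/* Sketch only: an unsmoothed variant of the ROC plot.               */
/* INTERPOL=JOIN and the AXIS labels are choices made here, not      */
/* taken from the original program.                                  */
SYMBOL VALUE=PLUS INTERPOL=JOIN;
AXIS1 LABEL=('1 - specificity');
AXIS2 LABEL=(ANGLE=90 'Sensitivity');

PROC GPLOT DATA=CAT;
   PLOT _SENSIT_*_1MSPEC_ / HAXIS=AXIS1 VAXIS=AXIS2;
RUN;
QUIT;
```

SYMBOL and AXIS statements are global in SAS, so these settings stay in effect for later plots in the same session until they are redefined.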

616 Appendix: Computer Programs for Logistic Regression
