The diagonal line across the plot is a reference line showing what would be
expected if the predicted probabilities were uninformative. The area under this
reference diagonal is 0.5. The area under the ROC curve is 0.745.
A slightly more complicated but more general approach for creating the same plot is to
use the roctab command with the graph option. After running the logistic regression,
the predict command can be used to create a new variable containing the predicted
probabilities (named PROB in the code below). The roctab command with the graph option
is then used with the variable FRACTURE listed first as the true outcome and the newly
created variable PROB listed next as the test variable. The code follows:
logit fracture agecat head patellar flex weight, or
predict prob
roctab fracture prob, graph
The roctab command can be used to create an ROC plot using any test variable
against a true outcome variable. The test variable does not necessarily have to contain
predicted probabilities from a logistic regression. In that sense, the roctab command
is more general than the lroc command.
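As a minimal sketch of that generality, roctab can be pointed at a raw covariate rather than a fitted probability. The choice of WEIGHT as the test variable is an assumption made purely for illustration (it is taken from the covariate list of the logit command above), and the summary option is added here to print the area under the curve alongside the plot:

```stata
* Illustrative only: a raw covariate, not a model prediction, is the test variable
roctab fracture weight, graph summary
```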
Conditional Logistic Regression
Conditional logistic regression is demonstrated with the MI dataset using the clogit
command. The MI dataset contains information from a case-control study in which
each case is matched with two controls. The model is stated as follows:
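\[
\operatorname{logit} P(\mathrm{CHD} = 1 \mid \mathbf{X}) = \beta_0 + \beta_1\,\mathrm{SMK} + \beta_2\,\mathrm{SBP} + \beta_3\,\mathrm{ECG} + \sum_{i=1}^{38} \gamma_i V_i
\]
\[
V_i =
\begin{cases}
1 & \text{if } i\text{th matched triplet} \\
0 & \text{otherwise}
\end{cases}
\qquad i = 1, 2, \ldots, 38
\]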
Open the dataset mi.dta. The code to run the conditional logistic regression in
Stata is:
clogit mi smk sbp ecg, strata(match)
The strata() option, with the variable MATCH in parentheses, identifies MATCH as
the stratification variable (i.e., the matching factor). The output follows:
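Conditional (fixed-effects) logistic regression   Number of obs =        117
                                                  LR chi2(3)    =      22.20
                                                  Prob > chi2   =     0.0001
Log likelihood = -31.745464                       Pseudo R2     =     0.2591

------------------------------------------------------------------------------
          mi |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         smk |   .7290581   .5612569     1.30   0.194    -.3709852    1.829101
         sbp |   .0456419   .0152469     2.99   0.003     .0157586    .0755251
         ecg |   1.599263   .8534134     1.87   0.061    -.0733967    3.271923
------------------------------------------------------------------------------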
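The coefficients in this output are on the log-odds scale. One way to view them as odds ratios, sketched here under the assumption that the same model and dataset are still in memory, is to rerun clogit with the or option or to exponentiate a stored coefficient directly:

```stata
* Refit the same model, reporting odds ratios instead of coefficients
clogit mi smk sbp ecg, strata(match) or

* Alternatively, exponentiate a stored coefficient by hand (SMK shown as an example)
display exp(_b[smk])
```

For SMK, for example, exp(0.729) is approximately 2.07.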