Logistic Regression: A Self-learning Text, Third Edition (Statistics in the Health Sciences)

[Figure: Study subjects → randomly select → Case, Control → P(X_case) > P(X_noncase)?]

EXAMPLE
Table 10.3: Collectively compare Se with 1 − Sp over all cut-points

MODEL 1:
cp      1.00  0.75  0.50  0.25  0.10  0.00
Se      0.00  0.10  0.60  1.00  1.00  1.00
>?      No    Yes   Yes   Yes   Yes   No
1 − Sp  0.00  0.00  0.00  0.00  0.60  1.00
Good discrimination: Se > 1 − Sp overall

MODEL 2:
cp      1.00  0.75  0.50  0.25  0.10  0.00
Se      0.00  0.10  0.60  0.80  0.90  1.00
>?      No    No    No    No    No    No
1 − Sp  0.00  0.10  0.60  0.80  0.90  1.00
Poor discrimination: Se never > 1 − Sp
(Here: Se = 1 − Sp always)

Problem with using above info:
Se and 1 − Sp values are summary
statistics for several subjects based
on a specific cut-point


Better approach:
Compute and compare predicted
probabilities for specific pairs
of subjects


↓
Obtained via ROC curves
(next section)

Returning to Table 10.3, suppose we pick a case
and a noncase at random from the subjects
analyzed in each model. Is the case or the noncase
more likely to have a higher predicted
probability?

Using Table 10.3, we can address this question
by “collectively” comparing, for each model, the
proportion of true positives (Se) with the
corresponding proportion of false positives
(1 − Sp) over all cut-points considered.

For Model 1, we find that the proportion of true
positives is larger than the proportion of false
positives at each cut-point except when
cp = 1.00 or 0.00, at which both proportions
are equal. These results suggest that Model 1
provides good discrimination since, overall,
Se values are greater than 1 − Sp values.

For Model 2, however, we find that the proportion
of true positives is identical to the proportion
of false positives at each cut-point. These
results suggest that Model 2 does not provide
good discrimination, since Se is never greater
(although also never less) than 1 − Sp.
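
The cut-point-by-cut-point comparison for the two models can be reproduced with a short sketch. The Se and 1 − Sp values are copied from Table 10.3; the names `cut_points`, `models`, and `compare` are illustrative choices made here, not notation from the text.

```python
# Se and 1 - Sp values copied from Table 10.3; all names are illustrative.
cut_points = [1.00, 0.75, 0.50, 0.25, 0.10, 0.00]

models = {
    "Model 1": {
        "se": [0.00, 0.10, 0.60, 1.00, 1.00, 1.00],
        "one_minus_sp": [0.00, 0.00, 0.00, 0.00, 0.60, 1.00],
    },
    "Model 2": {
        "se": [0.00, 0.10, 0.60, 0.80, 0.90, 1.00],
        "one_minus_sp": [0.00, 0.10, 0.60, 0.80, 0.90, 1.00],
    },
}


def compare(se, one_minus_sp):
    """Return, for each cut-point, whether Se strictly exceeds 1 - Sp."""
    return [s > f for s, f in zip(se, one_minus_sp)]


results = {name: compare(m["se"], m["one_minus_sp"]) for name, m in models.items()}

for name, flags in results.items():
    answers = ["Yes" if flag else "No" for flag in flags]
    print(name, dict(zip(cut_points, answers)))
```

The printed Yes/No pattern should match the ">?" rows of the table: Yes at the four interior cut-points for Model 1, and No throughout for Model 2.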

Nevertheless, the information in Table 10.3 is
not the best way to compare predicted
probabilities obtained from randomly selecting
a case and a noncase from the data. The reason:
sensitivity and 1 − specificity values are
summary statistics for several subjects based on
a specific cut-point; what is needed instead is
to compute and compare predicted probabilities
for specific pairs of subjects. The use of ROC
curves, which we describe in the next section,
provides an appropriate way to quantify and
compare such predicted probabilities.
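
To make the idea of comparing specific pairs of subjects concrete, the following minimal sketch counts, over every case/noncase pair, which subject has the higher predicted probability. The predicted probabilities are hypothetical values invented for illustration (they are not taken from Table 10.3 or from any fitted model), and `pairwise_comparison` is an illustrative helper name.

```python
from itertools import product

# Hypothetical predicted probabilities, invented for illustration only.
case_probs = [0.90, 0.70, 0.40]
noncase_probs = [0.60, 0.40, 0.20, 0.10]


def pairwise_comparison(cases, noncases):
    """Count case/noncase pairs by which subject has the higher probability."""
    higher = ties = lower = 0
    for p_case, p_noncase in product(cases, noncases):
        if p_case > p_noncase:
            higher += 1   # case has the higher predicted probability
        elif p_case == p_noncase:
            ties += 1     # both subjects have the same predicted probability
        else:
            lower += 1    # noncase has the higher predicted probability
    return higher, ties, lower


higher, ties, lower = pairwise_comparison(case_probs, noncase_probs)
total = higher + ties + lower
print(f"case higher: {higher}/{total}, tied: {ties}/{total}, "
      f"noncase higher: {lower}/{total}")
```

Unlike Se and 1 − Sp, these counts do not depend on any single cut-point: every pair is judged directly on its two predicted probabilities.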

354 10. Assessing Discriminatory Performance of a Binary Logistic Model
