Logistic Regression: A Self-learning Text, Third Edition (Statistics in the Health Sciences)

Note, however, that it is unusual to obtain an
AUC as high as 0.90. When this does happen,
almost all exposed subjects are cases and almost
all unexposed subjects are noncases (i.e., there
is nearly complete separation of data points).
When there is such “complete separation,” it
is impossible as well as unnecessary to fit a
logistic model to the data.
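
To see why the odds ratio breaks down under complete separation, the following minimal sketch computes the cross-product odds ratio ad/(bc) for a 2x2 table of disease status by exposure. The function name `odds_ratio`, the cell labels a, b, c, d, and all counts are hypothetical, chosen only for illustration.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for a 2x2 table (hypothetical helper).

    a = exposed cases,    b = unexposed cases,
    c = exposed noncases, d = unexposed noncases.
    Returns None when the denominator b*c is zero (ratio undefined).
    """
    if b * c == 0:
        return None
    return (a * d) / (b * c)


# Illustrative counts only (assumed, not taken from a real data set).
n1, n0 = 100, 200

# Complete separation: every case is exposed, every noncase is unexposed.
separated = odds_ratio(n1, 0, 0, n0)

# Some overlap between the groups: the ratio is finite.
mixed = odds_ratio(80, 20, 40, 160)

print("complete separation:", separated)
print("partial overlap:", mixed)
```

With both off-diagonal cells equal to zero, the denominator vanishes and no finite estimate exists, which is the situation the text describes.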

In this section, we return to the previously
asked question:

Suppose we pick a case and a noncase at random
from the subjects analyzed using a logistic
regression model. Is the case or the noncase more
likely to have a higher predicted probability?

To answer this question precisely, we must use
the fitted model to compute the proportion of
total case/noncase pairs for which the predicted
value for cases is at least as large as the
predicted value for noncases.

If this proportion is larger than 0.5, then the
answer is that the randomly chosen case will
likely have a higher predicted probability than
the randomly chosen noncase. Note that this is
what we would expect to occur if the model
provides at least minimal predictive power to
discriminate cases from noncases.

Moreover, the actual value of this proportion
tells us much more: it gives the “area under
the ROC curve” (i.e., AUC), which, as discussed
in the previous section, provides an overall
measure of the model’s ability to discriminate
cases from noncases.
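
The proportion described above can be sketched in Python as a brute-force loop over all case/noncase pairs. This is a minimal illustration, not the text's own procedure: the function name `pairwise_proportion`, the `tie_weight` parameter, and the sample probabilities are assumptions made for the example. Setting `tie_weight=1.0` follows the wording "at least as large as" literally, while the default of 0.5 gives tied pairs half credit, the usual convention under which the proportion matches the area under an empirical ROC curve.

```python
from itertools import product


def pairwise_proportion(case_probs, noncase_probs, tie_weight=0.5):
    """Share of case/noncase pairs in which the case scores higher.

    tie_weight is the credit given to a tied pair (an assumption of
    this sketch): 0.5 for half credit, 1.0 to count ties in full.
    """
    if not case_probs or not noncase_probs:
        raise ValueError("need at least one case and one noncase")
    score = 0.0
    for p_case, p_noncase in product(case_probs, noncase_probs):
        if p_case > p_noncase:
            score += 1.0
        elif p_case == p_noncase:
            score += tie_weight
    return score / (len(case_probs) * len(noncase_probs))


# Made-up predicted probabilities, for illustration only.
cases = [0.9, 0.6, 0.4]
noncases = [0.7, 0.4, 0.2, 0.1]

pd_half = pairwise_proportion(cases, noncases)                  # ties count 0.5
pd_full = pairwise_proportion(cases, noncases, tie_weight=1.0)  # ties count 1

print(f"ties as half: {pd_half:.4f}")
print(f"ties in full: {pd_full:.4f}")
```

Of the 12 pairs in this toy data, 9 favor the case and 1 is tied, so the two conventions differ only by the credit given to that single tied pair.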

To illustrate the calculation of this proportion,
suppose there are 300 (i.e., $n$) subjects in the
entire study, of which 100 (i.e., $n_1$) are true
cases and 200 (i.e., $n_0$) are true noncases.
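
Since every case can be paired with every noncase, the denominator of the proportion (the total number of case/noncase pairs) is the product of the two group sizes. As a worked step for these counts:

```latex
\[
  n_1 \times n_0 = 100 \times 200 = 20{,}000 \ \text{case/noncase pairs}
\]
```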

However:

 • Unusual to find AUC $\ge 0.9$
 • If so, there is nearly complete
   separation of data points

          E        NotE
 D        $n_1$    0
 NotD     0        $n_0$
 Total    $n_1$    $n_0$

 $\widehat{OR}$ undefined

IV. Computing the Area Under the ROC (AUC)


[Figure: from the study subjects, randomly select one case and one control (noncase), then ask: is $\hat{P}(X_{\text{case}}) > \hat{P}(X_{\text{noncase}})$?]

$$p_d = \frac{\text{no. of pairs in which } \hat{P}(X_{\text{case}}) \ge \hat{P}(X_{\text{noncase}})}{\text{total no. of case-control pairs}}$$


$p_d > 0.5 \Rightarrow \hat{P}(X_{\text{case}}) > \hat{P}(X_{\text{noncase}})$
for randomly chosen
case-control pair
(expect this result if model
discriminates cases from noncases)

More important:

$p_d = \text{AUC}$

EXAMPLE
Example of AUC calculation:

$n = 300$ subjects
$n_1 = 100$ true cases
$n_0 = 200$ true noncases

358 10. Assessing Discriminatory Performance of a Binary Logistic Model
