Logistic Regression: A Self-learning Text, Third Edition (Statistics in the Health Sciences)

Note, however, that it is unusual to obtain an AUC as high as 0.90; when this happens, almost all exposed subjects are cases and almost all unexposed subjects are noncases (i.e., there is nearly complete separation of data points). When there is such “complete separation,” it is impossible as well as unnecessary to fit a logistic model to the data.

In this section, we return to the previously asked question:

Suppose we pick a case and a noncase at random from the subjects analyzed using a logistic regression model. Is the case or the noncase more likely to have a higher predicted probability?

To answer this question precisely, we must use the fitted model to compute the proportion of total case/noncase pairs for which the predicted value for cases is at least as large as the predicted value for noncases.

If this proportion is larger than 0.5, then the answer is that the randomly chosen case will likely have a higher predicted probability than the randomly chosen noncase. Note that this is what we would expect to occur if the model provides at least minimal predictive power to discriminate cases from noncases.

Moreover, the actual value of this proportion tells us much more: it gives the “Area under the ROC” (i.e., AUC), which, as discussed in the previous section, provides an overall measure of the model’s ability to discriminate cases from noncases.
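As a minimal sketch of the pairwise proportion described above, the following Python function compares every case with every noncase. The function name `pair_proportion`, the `tie_weight` parameter, and the predicted probabilities are all hypothetical and chosen only for illustration; they do not come from the text. With `tie_weight=1.0`, ties are counted in full, matching the "at least as large as" wording; the more common convention for the empirical AUC counts each tie as one half (`tie_weight=0.5`), and the two agree only when there are no ties.

```python
from itertools import product


def pair_proportion(case_probs, noncase_probs, tie_weight=1.0):
    """Proportion of case/noncase pairs in which the case's predicted
    probability is at least as large as the noncase's.

    tie_weight=1.0 counts tied pairs fully (the wording used in the text);
    tie_weight=0.5 is the usual convention for the empirical AUC.
    """
    if not case_probs or not noncase_probs:
        raise ValueError("need at least one case and one noncase")
    total = 0.0
    # Visit every (case, noncase) pair exactly once.
    for p_case, p_non in product(case_probs, noncase_probs):
        if p_case > p_non:
            total += 1.0
        elif p_case == p_non:
            total += tie_weight
    return total / (len(case_probs) * len(noncase_probs))


# Hypothetical predicted probabilities, made up for illustration only.
cases = [0.9, 0.7, 0.4]
noncases = [0.8, 0.4, 0.2, 0.1]

print(pair_proportion(cases, noncases))        # ties counted fully
print(pair_proportion(cases, noncases, 0.5))   # ties counted as one half
```

The double loop visits every pair, which is fine for a study of a few hundred subjects; for large data sets a rank-based calculation would be a more efficient way to get the same quantity.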

To illustrate the calculation of this proportion, suppose there are 300 (i.e., $n$) subjects in the entire study, of which 100 (i.e., $n_1$) are true cases and 200 (i.e., $n_0$) are true noncases.
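With these counts, the denominator of the proportion (the total number of case/noncase pairs) follows from pairing each case with each noncase:

```latex
\[
n_1 \times n_0 = 100 \times 200 = 20{,}000 \ \text{case/noncase pairs}
\]
```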

However:

Unusual to find AUC ≥ 0.9
If so, there is nearly complete separation of data points

            E        Not E
D           $n_1$    0
Not D       0        $n_0$
Total       $n_1$    $n_0$

$\widehat{OR}$ undefined

IV. Computing the Area Under the ROC (AUC)

Study subjects: randomly select one case and one noncase (control).

Is $P(X_{\text{case}}) > P(X_{\text{noncase}})$?

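$$p_d = \frac{\text{no. of pairs in which } \hat{P}(X_{\text{case}}) \ge \hat{P}(X_{\text{noncase}})}{\text{total no. of case-control pairs}}$$

$p_d > 0.5 \Rightarrow \hat{P}(X_{\text{case}}) > \hat{P}(X_{\text{noncase}})$ for a randomly chosen case-control pair (expect this result if the model discriminates cases from noncases)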

More important:

$p_d = \text{AUC}$

EXAMPLE

Example of AUC calculation:

$n = 300$ subjects
$n_1 = 100$ true cases
$n_0 = 200$ true noncases

358 10. Assessing Discriminatory Performance of a Binary Logistic Model
