Logistic Regression: A Self-learning Text, Third Edition (Statistics in the Health Sciences)

[Diagram: Study Subjects; randomly select a Case and a Control; P(Xcase) > P(Xnoncase)?]

EXAMPLE
Table 10.3: Collectively compare Se with 1 − Sp over all cut-points

MODEL 1:
cp      1.00  0.75  0.50  0.25  0.10  0.00
Se      0.00  0.10  0.60  1.00  1.00  1.00
>?      No    Yes   Yes   Yes   Yes   No
1 − Sp  0.00  0.00  0.00  0.00  0.60  1.00
Good discrimination: Se > 1 − Sp overall

MODEL 2:
cp      1.00  0.75  0.50  0.25  0.10  0.00
Se      0.00  0.10  0.60  0.80  0.90  1.00
>?      No    No    No    No    No    No
1 − Sp  0.00  0.10  0.60  0.80  0.90  1.00
Poor discrimination: Se never > 1 − Sp (Here: Se = 1 − Sp always)

Problem with using above info:
Se and 1 − Sp values are summary
statistics for several subjects based
on a specific cut-point

Better approach:
Compute and compare predicted
probabilities for specific pairs
of subjects

↓ Obtained via ROC curves (next section)

Returning to Table 10.3, suppose we pick a case and a noncase at random from the subjects analyzed in each model. Is the case or the noncase more likely to have a higher predicted probability?

Using Table 10.3, we can address this question by “collectively” comparing, for each model, the proportion of true positives (Se) with the corresponding proportion of false positives (1 − Sp) over all cut-points considered.
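As a minimal sketch of how the two proportions arise at a given cut-point, the Python fragment below computes Se and 1 − Sp from a set of predicted probabilities. The data values and the function name se_and_one_minus_sp are hypothetical (they are not the data behind Table 10.3), and the rule "predict case when the predicted probability is at least cp" is an assumption made for illustration.

```python
def se_and_one_minus_sp(probs, outcomes, cp):
    """Return (Se, 1 - Sp) at cut-point cp.

    Assumption: a subject is predicted to be a case when prob >= cp.
    outcomes uses 1 for a true case and 0 for a true noncase.
    """
    case_probs = [p for p, y in zip(probs, outcomes) if y == 1]
    noncase_probs = [p for p, y in zip(probs, outcomes) if y == 0]
    # Se: proportion of true cases predicted to be cases (true positives)
    se = sum(p >= cp for p in case_probs) / len(case_probs)
    # 1 - Sp: proportion of true noncases predicted to be cases (false positives)
    one_minus_sp = sum(p >= cp for p in noncase_probs) / len(noncase_probs)
    return se, one_minus_sp


# Hypothetical predicted probabilities for 4 cases and 4 noncases
probs = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
outcomes = [1, 1, 1, 1, 0, 0, 0, 0]

for cp in (1.00, 0.75, 0.50, 0.25, 0.10, 0.00):
    se, fp = se_and_one_minus_sp(probs, outcomes, cp)
    print(f"cp={cp:.2f}  Se={se:.2f}  1-Sp={fp:.2f}")
```

Each printed row is one column of a table laid out like Table 10.3: a single pair of summary proportions per cut-point.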

For Model 1, we find that the proportion of true positives is larger than the proportion of false positives at each cut-point except when cp = 1.00 or 0.00, at which both proportions are equal. These results suggest that Model 1 provides good discrimination since, overall, Se values are greater than 1 − Sp values.

For Model 2, however, we find that the proportion of true positives is identical to the proportion of false positives at each cut-point. These results suggest that Model 2 does not provide good discrimination, since Se is never greater (although also never less) than 1 − Sp.
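The cut-point-by-cut-point comparison for the two models can be reproduced with a short Python sketch. The Se and 1 − Sp values are copied from Table 10.3; the variable names and the helper se_exceeds are illustrative choices, not part of the text.

```python
cut_points = [1.00, 0.75, 0.50, 0.25, 0.10, 0.00]

# Se and 1 - Sp values as listed in Table 10.3
table_10_3 = {
    "Model 1": {
        "Se":   [0.00, 0.10, 0.60, 1.00, 1.00, 1.00],
        "1-Sp": [0.00, 0.00, 0.00, 0.00, 0.60, 1.00],
    },
    "Model 2": {
        "Se":   [0.00, 0.10, 0.60, 0.80, 0.90, 1.00],
        "1-Sp": [0.00, 0.10, 0.60, 0.80, 0.90, 1.00],
    },
}


def se_exceeds(se_values, one_minus_sp_values):
    """Return 'Yes'/'No' per cut-point: is Se strictly greater than 1 - Sp?"""
    return ["Yes" if se > fp else "No"
            for se, fp in zip(se_values, one_minus_sp_values)]


for model, rows in table_10_3.items():
    answers = se_exceeds(rows["Se"], rows["1-Sp"])
    print(model)
    for cp, answer in zip(cut_points, answers):
        print(f"  cp={cp:.2f}  Se > 1-Sp? {answer}")
```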

Nevertheless, the use of information from Table 10.3 is not the best way to compare predicted probabilities obtained from randomly selecting a case and noncase from the data. The reason: sensitivity and 1 − specificity values are summary statistics for several subjects based on a specific cut-point; what is needed instead is to compute and compare predicted probabilities for specific pairs of subjects. The use of ROC curves, which we describe in the next section, provides an appropriate way to quantify and compare such predicted probabilities.
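To make the idea of comparing specific pairs concrete, the sketch below enumerates every case/noncase pair directly and tallies how often the case has the higher predicted probability. This is a plain enumeration for illustration only, not the ROC procedure itself; the predicted probabilities and the function name pairwise_comparison are hypothetical.

```python
from itertools import product


def pairwise_comparison(case_probs, noncase_probs):
    """Compare predicted probabilities over all (case, noncase) pairs.

    Returns the proportions of pairs in which the case's predicted
    probability is higher than, equal to, or lower than the noncase's.
    """
    pairs = list(product(case_probs, noncase_probs))
    total = len(pairs)
    higher = sum(c > n for c, n in pairs)
    tied = sum(c == n for c, n in pairs)
    lower = total - higher - tied
    return higher / total, tied / total, lower / total


# Hypothetical predicted probabilities (not taken from Table 10.3)
case_probs = [0.9, 0.8, 0.6, 0.4]
noncase_probs = [0.7, 0.3, 0.2, 0.1]

higher, tied, lower = pairwise_comparison(case_probs, noncase_probs)
print(f"case higher: {higher:.3f}, tied: {tied:.3f}, case lower: {lower:.3f}")
```

Unlike Se and 1 − Sp, this tally does not depend on any cut-point: each pair contributes one direct comparison of two predicted probabilities.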

354 10. Assessing Discriminatory Performance of a Binary Logistic Model
