Logistic Regression: A Self-learning Text, Third Edition (Statistics in the Health Sciences)

A widely used GOF measure for many mathematical models is called thedeviance. However, as we describe later, for a binary logistic regression model, the use of the deviance for assessing GOF is problematic.

A popular alternative is the Hosmer–Lemeshow (HL) statistic. Both the deviance and the HL statistic will be defined and illustrated in this chapter.

Widely used GOF measure:devi-
ance

But: deviance problematic for
binary logistic regression

Popular alternative:
Hosmer–Lemeshow (HL)
statistic

II. Saturated vs. Fully
Parameterized Models

GOF: Overall comparison between
observed and predicted out-
comes

Perfect fit:YiY^i¼ 0 for alli.

Rarely happens

Typically 0 <Y^i< 1 for mosti

Perfect fit:Not practical goal
Conceptual ideal
Saturated model
(a reference point)

As stated briefly in the previous overview sec- tion, a measure of goodness of fit (GOF) pro- vides an overall comparison between observed (Yi) and predicted valuesðY^iÞof the outcome variable.

We say there isperfect fitifYiY^i¼ 0 for alli. Although the mathematical characteristics of any logistic model require that the predicted value for any subject must lie between or including 0 or 1, it rarely happens that predicted values are either 0 or 1 forallsubjects in a given dataset. In fact, the predicted values for most, if not all, subjects will lie above 0 and below 1.

Thus, achieving “perfect fit” is typically not a practical goal, but rather is a conceptual ideal when fitting a model to one’s data. Neverthe- less, since we typically want to use this “ideal” model as a reference point for assessing the fit of any specific model of interest, it is conve- nient to identify such a model as asaturated model.

A trivial example of a saturated regression model is obtained if we have a dataset contain- ing onlyn¼2 subjects, as shown on the left. Here, the outcome variable, SBP, is continu- ous, as is the (nonsense) predictor variable foot length (FOOT). A “perfect” straight line fits the data.

EXAMPLE

n= 2

FOOT

SBP

9 11

115

170

Subject # SBP FOOT

(^11159)
2 170 11
Presentation: II. Saturated vs. Fully Parameterized Models 305

Logistic Regression: A Self-learning Text, Third Edition (Statistics in the Health Sciences)

Get our desktop app

Company

Features

Documentation

Resources