A widely used GOF measure for many mathematical models is called the
deviance. However, as we describe later, for a binary logistic
regression model, the use of the deviance for assessing GOF is
problematic.
A popular alternative is the Hosmer–Lemeshow (HL) statistic. Both the
deviance and the HL statistic will be defined and illustrated in this
chapter.
Widely used GOF measure: deviance
But: deviance problematic for binary logistic regression
Popular alternative: Hosmer–Lemeshow (HL) statistic
II. Saturated vs. Fully
Parameterized Models
GOF: Overall comparison between observed and predicted outcomes
Perfect fit: $Y_i - \hat{Y}_i = 0$ for all $i$.
Rarely happens
Typically $0 < \hat{Y}_i < 1$ for most $i$
Perfect fit: Not practical goal
Conceptual ideal
Saturated model (a reference point)
As stated briefly in the previous overview section, a measure of
goodness of fit (GOF) provides an overall comparison between observed
($Y_i$) and predicted ($\hat{Y}_i$) values of the outcome variable.
We say there is perfect fit if $Y_i - \hat{Y}_i = 0$ for all $i$.
Although the mathematical characteristics of any logistic model
require that the predicted value for any subject lie between 0 and 1
(inclusive), it rarely happens that predicted values are either 0 or 1
for all subjects in a given dataset. In fact, the predicted values for
most, if not all, subjects will lie above 0 and below 1.
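To illustrate why predicted values of exactly 0 or 1 are rare, here is a minimal Python sketch. The linear predictor values are hypothetical and chosen only for illustration; they do not come from any dataset in this chapter.

```python
import math


def logistic(z):
    """Logistic function 1 / (1 + exp(-z)) applied to a linear predictor z."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical linear predictor values (illustration only, not from the text).
linear_predictors = [-5.0, -1.0, 0.0, 1.0, 5.0]

# Predicted values for each hypothetical subject.
predicted = [logistic(z) for z in linear_predictors]

for z, y_hat in zip(linear_predictors, predicted):
    print(f"z = {z:5.1f}  predicted value = {y_hat:.4f}")
```

For moderate linear predictor values such as these, every predicted value lies strictly above 0 and below 1, so with a 0/1 outcome the difference between observed and predicted values cannot be exactly 0.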
Thus, achieving “perfect fit” is typically not a practical goal, but
rather is a conceptual ideal when fitting a model to one’s data.
Nevertheless, since we typically want to use this “ideal” model as a
reference point for assessing the fit of any specific model of
interest, it is convenient to identify such a model as a saturated
model.
A trivial example of a saturated regression model is obtained if we
have a dataset containing only $n = 2$ subjects, as shown in the
accompanying example. Here, the outcome variable, SBP, is continuous,
as is the (nonsense) predictor variable foot length (FOOT). A
“perfect” straight line fits the data.
EXAMPLE
n = 2
[Figure: SBP (vertical axis, ticks at 115 and 170) plotted against FOOT (horizontal axis, ticks at 9 and 11)]

Subject #   SBP   FOOT
1           115   9
2           170   11
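The perfect fit in this two-subject example can be checked directly. The following is a minimal Python sketch, not part of the original presentation; the variable names are illustrative. It computes the straight line through the two (FOOT, SBP) points and confirms that observed and predicted SBP agree for both subjects.

```python
# Two-subject dataset from the example: FOOT is the predictor, SBP the outcome.
foot = [9.0, 11.0]
sbp = [115.0, 170.0]

# With n = 2 subjects and two parameters (intercept and slope), the fitted
# straight line is simply the line passing through both data points.
slope = (sbp[1] - sbp[0]) / (foot[1] - foot[0])
intercept = sbp[0] - slope * foot[0]

# Predicted SBP for each subject, and observed minus predicted.
predicted = [intercept + slope * x for x in foot]
residuals = [y - y_hat for y, y_hat in zip(sbp, predicted)]

print("slope:", slope)
print("intercept:", intercept)
print("residuals:", residuals)
```

Because the model has as many parameters as there are subjects, both residuals are 0, which is what makes this a saturated model.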