A widely used GOF measure for many mathematical
models is called the deviance. However,
as we describe later, for a binary logistic
regression model, the use of the deviance for
assessing GOF is problematic.

A popular alternative is the Hosmer–Lemeshow
(HL) statistic. Both the deviance and the HL
statistic will be defined and illustrated in this
chapter.

Widely used GOF measure: deviance
But: deviance problematic for
binary logistic regression
Popular alternative:
Hosmer–Lemeshow (HL)
statistic
II. Saturated vs. Fully Parameterized Models
GOF: Overall comparison between
observed and predicted outcomes
Perfect fit: $Y_i - \hat{Y}_i = 0$ for all $i$.
Rarely happens
Typically $0 < \hat{Y}_i < 1$ for most $i$
Perfect fit: Not practical goal
Conceptual ideal
Saturated model
(a reference point)
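The notes above on perfect fit can be illustrated numerically. The following is a minimal sketch in Python, assuming a one-predictor logistic model; the coefficients (`alpha`, `beta`) and the five data points are hypothetical values chosen only for illustration and do not come from the text. It shows that each predicted value lies strictly between 0 and 1, so no difference $Y_i - \hat{Y}_i$ equals 0 when the observed outcomes are 0 or 1.

```python
import math


def predicted_probability(alpha, beta, x):
    """Logistic model prediction: 1 / (1 + exp(-(alpha + beta * x)))."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * x)))


# Hypothetical coefficients and data (assumptions, not from the text).
alpha, beta = -2.0, 0.5
x_values = [0, 2, 4, 6, 8]
y_observed = [0, 0, 1, 1, 1]

# Predicted value for each subject.
y_predicted = [predicted_probability(alpha, beta, x) for x in x_values]

# Difference between observed and predicted outcome for each subject.
differences = [y - p for y, p in zip(y_observed, y_predicted)]

for x, y, p, d in zip(x_values, y_observed, y_predicted, differences):
    print(f"x={x}  Y={y}  Y_hat={p:.4f}  Y - Y_hat={d:+.4f}")
```

In this sketch every predicted value is a proportion rather than exactly 0 or 1, which is why perfect fit serves as a reference point rather than a practical goal.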
As stated briefly in the previous overview section,
a measure of goodness of fit (GOF) provides
an overall comparison between observed
($Y_i$) and predicted values ($\hat{Y}_i$) of the outcome
variable.

We say there is perfect fit if $Y_i - \hat{Y}_i = 0$ for all $i$.
Although the mathematical characteristics of
any logistic model require that the predicted
value for any subject must lie between 0 and 1,
inclusive, it rarely happens that predicted
values are either 0 or 1 for all subjects
in a given dataset. In fact, the predicted values
for most, if not all, subjects will lie above 0 and
below 1.

Thus, achieving “perfect fit” is typically not a
practical goal, but rather is a conceptual ideal
when fitting a model to one’s data. Nevertheless,
since we typically want to use this “ideal”
model as a reference point for assessing the fit
of any specific model of interest, it is convenient
to identify such a model as a saturated
model.

A trivial example of a saturated regression
model is obtained if we have a dataset containing
only n = 2 subjects, as shown in the example below.
Here, the outcome variable, SBP, is continuous,
as is the (nonsense) predictor variable foot
length (FOOT). A “perfect” straight line fits the
data.

EXAMPLE (n = 2)
Subject #   SBP   FOOT
1           115   9
2           170   11
[Figure: SBP (115, 170) plotted against FOOT (9, 11)]
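The two-subject example can be checked numerically. The following is a minimal sketch in Python using the example's data (subject 1: SBP 115, FOOT 9; subject 2: SBP 170, FOOT 11). The variable names and the direct two-point calculation of the line are our own illustrative choices, not a method given in the text.

```python
# Data from the n = 2 example.
foot = [9.0, 11.0]    # predictor: foot length (FOOT)
sbp = [115.0, 170.0]  # outcome: systolic blood pressure (SBP)

# A straight line has two parameters, so with two subjects the line
# through both points is determined exactly.
slope = (sbp[1] - sbp[0]) / (foot[1] - foot[0])
intercept = sbp[0] - slope * foot[0]

# Predicted SBP for each subject, and observed minus predicted.
predicted = [intercept + slope * x for x in foot]
differences = [y - p for y, p in zip(sbp, predicted)]

print(slope, intercept)  # 27.5 -132.5
print(differences)       # [0.0, 0.0]
```

Both differences are 0, so this line fits the data perfectly even though FOOT is a nonsense predictor. The perfect fit arises because the model has as many parameters (intercept and slope) as there are subjects.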