Logistic Regression: A Self-learning Text, Third Edition (Statistics in the Health Sciences)


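Alternative (events–trials) deviance formula for the logistic model:

G covariate patterns
$X_g = (X_{g1}, X_{g2}, \ldots, X_{gp}),\quad g = 1, 2, \ldots, G$
$\hat{d}_g = n_g \hat{P}(X_g)$ = expected cases
$d_g$ = observed cases

$$
\mathrm{Dev}_{ET}(\hat{\beta})
= -2\ln \hat{L}_{c,ET} - \left(-2\ln \hat{L}_{\max,ET}\right)
= 2\sum_{g=1}^{G}\left[ d_g \ln\!\left(\frac{d_g}{\hat{d}_g}\right)
+ (n_g - d_g)\ln\!\left(\frac{n_g - d_g}{n_g - \hat{d}_g}\right)\right],
$$

where $-2\ln \hat{L}_{c,ET}$ and $-2\ln \hat{L}_{\max,ET}$
are defined using events–trials
(ET) format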


First, we present an alternative formula for the
deviance in a logistic model that considers the
covariate patterns defined by one's current model.
We assume that this model contains G covariate
patterns $X_g = (X_{g1}, X_{g2}, \ldots, X_{gp})$, with $n_g$ subjects
having pattern g. As defined earlier, $\hat{d}_g = n_g \hat{P}(X_g)$
denotes the expected cases, where $\hat{P}(X_g)$ is
the predicted risk for $X = X_g$, and $d_g$ denotes
the observed number of cases in subgroup g.

The alternative deviance formula is shown here
at the left. This formula corresponds to the dataset
listed in events–trials format, where there
are G data lines, and $d_g$ and $n_g$ denote the number
of events and number of trials, respectively, on
the gth data line.
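
As a minimal sketch of how this formula could be evaluated numerically, the Python function below sums the two log terms over the G data lines. The function name `deviance_et` and the sample counts are illustrative assumptions, not taken from the text. The sketch also assumes the usual convention that a term of the form 0·ln(0) contributes zero.

```python
import math

def deviance_et(n, d_obs, d_exp):
    """Events-trials deviance:
    2 * sum_g [ d_g*ln(d_g/dhat_g) + (n_g-d_g)*ln((n_g-d_g)/(n_g-dhat_g)) ]."""
    def term(obs, exp):
        # Assumed convention: a zero observed count contributes 0 to the sum.
        return 0.0 if obs == 0 else obs * math.log(obs / exp)

    total = 0.0
    for n_g, d_g, dhat_g in zip(n, d_obs, d_exp):
        total += term(d_g, dhat_g) + term(n_g - d_g, n_g - dhat_g)
    return 2.0 * total

# Hypothetical data: two covariate patterns (data lines), 10 trials each.
n = [10, 10]
d_obs = [6, 4]        # observed cases d_g
d_exp = [5.0, 5.0]    # expected cases n_g * P_hat(X_g) from some fitted model
print(deviance_et(n, d_obs, d_exp))
```

When the expected cases equal the observed cases on every data line, each log term is ln(1) = 0 and the function returns zero.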

EXAMPLE
Model 3: logit P(X) = α + βE + γV + δEV

Pattern   E   V   n_g   Exp. cases          Obs. cases
X_1       1   1   10    $\hat{d}_1 = 6$     $d_1 = 6$
X_2       0   1   10    $\hat{d}_2 = 4$     $d_2 = 4$
X_3       1   0   10    $\hat{d}_3 = 3$     $d_3 = 3$
X_4       0   0   10    $\hat{d}_4 = 7$     $d_4 = 7$

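$\mathrm{Dev}_{ET}(\hat{\beta})$ for Model 3:

$$
\begin{aligned}
\mathrm{Dev}_{ET}(\hat{\beta})
&= 2\left[6\ln\frac{6}{6} + 4\ln\frac{4}{4}\right]
 + 2\left[4\ln\frac{4}{4} + 6\ln\frac{6}{6}\right] \\
&\quad + 2\left[3\ln\frac{3}{3} + 7\ln\frac{7}{7}\right]
 + 2\left[7\ln\frac{7}{7} + 3\ln\frac{3}{3}\right] \\
&= 0
\end{aligned}
$$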

Deviance formula $\mathrm{Dev}_{ET}(\hat{\beta})$ uses
events–trials format
⇓
Units of analysis are groups
(not subjects)

Recall that for the data set on n = 40 subjects
described above, Model 3 has G = 4 covariate
patterns. For each pattern, the corresponding
values for $n_g$, $\hat{d}_g$, and $d_g$ are shown at the left.

Substituting the values in the table into the
alternative (events–trials) deviance formula,
we find that the resulting deviance equals zero.

How can we explain this result, since we know
that the fully parameterized Model 3 is not
saturated in terms of perfectly predicting the
0 or 1 outcome for each of the 40 subjects in the
dataset? The answer is that since the ET deviance
formula corresponds to an events–trials
format, the units of analysis being considered
by the formula are the four groups of covariate
patterns rather than the 40 subjects.
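
The contrast between the two units of analysis can be checked numerically. The sketch below uses the Model 3 table values and computes the deviance twice: once over the four groups, and once over the 40 subjects. For the subject-level version it assumes that the saturated model predicts each 0 or 1 outcome perfectly, so its maximized likelihood is 1 and the deviance reduces to $-2\ln \hat{L}_c$. The names `dev_et`, `dev_subject`, and `xlogy` are illustrative, not from the text.

```python
import math

# Model 3 table values: trials n_g, observed cases d_g, expected cases dhat_g.
n = [10, 10, 10, 10]
d_obs = [6, 4, 3, 7]
d_exp = [6.0, 4.0, 3.0, 7.0]

def xlogy(x, y):
    # x * ln(y), with the assumed convention that the term is 0 when x == 0.
    return 0.0 if x == 0 else x * math.log(y)

# Group-level (events-trials) deviance: compares observed with expected counts.
dev_et = 2.0 * sum(
    xlogy(d, d / dh) + xlogy(ng - d, (ng - d) / (ng - dh))
    for ng, d, dh in zip(n, d_obs, d_exp)
)

# Subject-level deviance, assuming the saturated likelihood equals 1:
# each case contributes ln(P_hat) and each noncase ln(1 - P_hat).
dev_subject = -2.0 * sum(
    xlogy(d, dh / ng) + xlogy(ng - d, 1.0 - dh / ng)
    for ng, d, dh in zip(n, d_obs, d_exp)
)

print(dev_et)       # 0.0
print(dev_subject)  # strictly positive
```

Under these assumptions the group-level deviance is zero because the fitted counts reproduce the observed counts in all four groups, while the subject-level deviance stays positive because no predicted risk equals 0 or 1.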

314 9. Assessing Goodness of Fit for Logistic Regression
