Logistic Regression: A Self-learning Text, Third Edition (Statistics in the Health Sciences)

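EXAMPLE

Covariate pattern               Obs. risk            Pred. risk
$\mathbf{X}_1$: E = 1, V = 1    $\hat{p}_1 = 0.6$    $\hat{P}(\mathbf{X}_1) = 0.6$
$\mathbf{X}_2$: E = 0, V = 1    $\hat{p}_2 = 0.4$    $\hat{P}(\mathbf{X}_2) = 0.4$
$\mathbf{X}_3$: E = 1, V = 0    $\hat{p}_3 = 0.3$    $\hat{P}(\mathbf{X}_3) = 0.3$
$\mathbf{X}_4$: E = 0, V = 0    $\hat{p}_4 = 0.7$    $\hat{P}(\mathbf{X}_4) = 0.7$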

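\[
\begin{aligned}
\mathrm{Dev}_{SS}(\hat{\boldsymbol{\beta}})
&= -2(10)\bigl[\,0.6\ln(0.6/0.4) + \ln(0.4) \\
&\qquad\qquad + 0.4\ln(0.4/0.6) + \ln(0.6) \\
&\qquad\qquad + 0.3\ln(0.3/0.7) + \ln(0.7) \\
&\qquad\qquad + 0.7\ln(0.7/0.3) + \ln(0.3)\,\bigr] \\
&= 51.3552
\end{aligned}
\]

\[
\mathrm{Dev}_{ET}(\hat{\boldsymbol{\beta}}) = 0.0 \;\neq\; \mathrm{Dev}_{SS}(\hat{\boldsymbol{\beta}}) = 51.3552
\]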

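$\mathrm{Dev}_{ET}(\hat{\boldsymbol{\beta}}) = 0.0$ because

\[
-2\ln\hat{L}_{ET\ \mathrm{saturated}} = -2\ln\hat{L}_{\mathrm{Model\ 3}} = -2\ln\hat{L}_{C}
\]

so

\[
\begin{aligned}
\mathrm{Dev}_{ET}(\hat{\boldsymbol{\beta}})
&= -2\ln\hat{L}_{C} - \bigl(-2\ln\hat{L}_{ET\ \mathrm{saturated}}\bigr) \\
&= -2\ln\hat{L}_{\mathrm{Model\ 3}} - \bigl(-2\ln\hat{L}_{\mathrm{Model\ 3}}\bigr) \\
&= 0.0
\end{aligned}
\]

$\mathrm{Dev}_{SS}(\hat{\boldsymbol{\beta}}) \neq 0.0$ because

\[
-2\ln\hat{L}_{SS\ \mathrm{saturated}} = 0.0
\]

so

\[
\begin{aligned}
\mathrm{Dev}_{SS}(\hat{\boldsymbol{\beta}})
&= -2\ln\hat{L}_{C} - \bigl(-2\ln\hat{L}_{SS\ \mathrm{saturated}}\bigr) \\
&= -2\ln\hat{L}_{\mathrm{Model\ 3}} - 0 \\
&= 51.3552
\end{aligned}
\]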

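Note: $-2\ln\hat{L}_{C,ET} \neq -2\ln\hat{L}_{C,SS}$

\[
-2\ln\hat{L}_{C,ET} = -2\ln\hat{L}_{C,SS} - 2K,
\quad\text{where}\quad
K = \ln\prod_{g=1}^{G}\frac{n_g!}{d_g!\,(n_g - d_g)!}
\]

Model 3: $-2\ln\hat{L}_{C,ET} = 10.8168$, $-2\ln\hat{L}_{C,SS} = 51.3552$, $K = 20.2692$

$K$ does not involve $\boldsymbol{\beta}$, so $\hat{\boldsymbol{\beta}}$ is the same for SS or ET.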

To compute this formula for Model 3, we need to provide values of
$\hat{P}(\mathbf{X}_i)$ for each of the 40 subjects in the data set.
Nevertheless, this calculation can be simplified since there are only
four distinct values of $\hat{P}(\mathbf{X}_i)$ over all 40 subjects.
These correspond to the four covariate patterns of Model 3.

The calculation is now shown at the left, where we have substituted
each of the four distinct values of $\hat{P}(\mathbf{X}_i)$ 10 times
in the above formula.
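The arithmetic can be checked with a few lines of Python. This is only a sketch: the names `predicted_risks`, `subjects_per_pattern`, and `ss_term` are illustrative and not from the text, and the per-subject term is the one that appears inside the brackets of the calculation at the left.

```python
import math

# Predicted risks for the four covariate patterns of Model 3 (from the example table).
predicted_risks = [0.6, 0.4, 0.3, 0.7]
# Each distinct predicted risk is substituted 10 times (40 subjects, 4 patterns).
subjects_per_pattern = 10


def ss_term(p):
    """Contribution of one subject with predicted risk p: p*ln(p/(1-p)) + ln(1-p)."""
    return p * math.log(p / (1 - p)) + math.log(1 - p)


dev_ss = -2 * subjects_per_pattern * sum(ss_term(p) for p in predicted_risks)
print(f"Dev_SS = {dev_ss:.3f}")
```

To three decimal places the result matches the reported value of 51.3552; any difference in the fourth decimal presumably reflects rounding in the original calculation.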

We have thus seen that the events–trials and subject-specific deviance
values obtained for Model 3 are numerically quite different.

$\mathrm{Dev}_{ET}(\hat{\boldsymbol{\beta}})$ is zero because the ET
formula assumes that the (group-) saturated model is the fully
parameterized Model 3, and the current model being considered is also
Model 3. So the values of their corresponding log likelihood statistics
($-2\ln\hat{L}$) are equal and their difference is zero.
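The following sketch illustrates this numerically, assuming the grouped (binomial) form of the likelihood with $n_g = 10$ subjects per covariate pattern and $d_g = 6, 4, 3, 7$ cases (the observed risks multiplied by 10). The function name `neg2_loglik_et` is hypothetical, chosen here for illustration.

```python
import math

# Assumed grouped data for Model 3: group sizes and case counts per covariate pattern.
n = [10, 10, 10, 10]
d = [6, 4, 3, 7]
# Predicted risks from the fitted Model 3 (from the example table).
predicted = [0.6, 0.4, 0.3, 0.7]


def neg2_loglik_et(n, d, p):
    """-2 ln L for events-trials data, including the binomial coefficient."""
    total = 0.0
    for ng, dg, pg in zip(n, d, p):
        total += (math.log(math.comb(ng, dg))
                  + dg * math.log(pg)
                  + (ng - dg) * math.log(1 - pg))
    return -2 * total


# The group-saturated model predicts each group's observed risk d_g / n_g.
observed = [dg / ng for dg, ng in zip(d, n)]

neg2_ll_current = neg2_loglik_et(n, d, predicted)
neg2_ll_saturated = neg2_loglik_et(n, d, observed)
dev_et = neg2_ll_current - neg2_ll_saturated
print(f"-2 ln L (current)   = {neg2_ll_current:.3f}")
print(f"-2 ln L (saturated) = {neg2_ll_saturated:.3f}")
print(f"Dev_ET              = {dev_et:.3f}")
```

Because the predicted risks coincide with the observed risks in every pattern, the two log likelihood statistics are the same number and the deviance is zero.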

In contrast, $\mathrm{Dev}_{SS}(\hat{\boldsymbol{\beta}})$ is different
from zero because the SS formula assumes that the saturated model is
the “classical” (SS) saturated model that perfectly predicts each
subject’s 0 or 1 outcome. As mentioned earlier, $-2\ln\hat{L}$ is
always zero for the SS saturated model. Thus,
$\mathrm{Dev}_{SS}(\hat{\boldsymbol{\beta}})$ simplifies to
$-2\ln\hat{L}_{C}$ for the current model (i.e., Model 3), whose value
is 51.3552.
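A subject-level sketch of the same point, assuming the 40 subjects are reconstructed from the four covariate patterns with 6, 4, 3, and 7 cases out of 10 (observed risk times 10). The names `patterns`, `subjects`, and `neg2_loglik_ss` are illustrative only.

```python
import math

# (predicted risk, number of cases, number of subjects) for each covariate pattern.
# Case counts are an assumption: observed risk times 10 subjects per pattern.
patterns = [(0.6, 6, 10), (0.4, 4, 10), (0.3, 3, 10), (0.7, 7, 10)]

# Expand to one (outcome, predicted risk) pair per subject.
subjects = []
for p_hat, cases, size in patterns:
    subjects += [(1, p_hat)] * cases + [(0, p_hat)] * (size - cases)


def neg2_loglik_ss(pairs):
    """-2 ln L for subject-specific data: each subject contributes the
    log of the probability assigned to his or her observed outcome."""
    total = 0.0
    for y, p in pairs:
        prob_of_outcome = p if y == 1 else 1 - p
        total += math.log(prob_of_outcome)
    return -2 * total


# Current model: every subject gets the Model 3 predicted risk for their pattern.
neg2_ll_current = neg2_loglik_ss(subjects)
# SS saturated model: the predicted risk equals the observed 0/1 outcome,
# so every subject contributes ln(1) = 0.
neg2_ll_saturated = neg2_loglik_ss([(y, float(y)) for y, _ in subjects])

dev_ss = neg2_ll_current - neg2_ll_saturated
print(f"Dev_SS = {dev_ss:.3f}")
```

Since the saturated term vanishes, the deviance is just the current model's $-2\ln\hat{L}$, which agrees with 51.3552 to three decimal places.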

Mathematically, the formula for $-2\ln\hat{L}_{C}$ differs with the
format (ET or SS) being used to specify the data. However, these
formulae differ only by the constant $2K$, as we show on the left and
illustrate for Model 3. Because the formula for $K$ does not involve
$\boldsymbol{\beta}$, the ML estimate $\hat{\boldsymbol{\beta}}$ will
be the same for either format. Consequently, some computer packages
(e.g., SAS) present the same value (i.e., $-2\ln\hat{L}_{C,SS}$)
regardless of the data layout used.
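The constant $K$ for Model 3 can be reproduced as follows. This is a sketch that assumes $n_g = 10$ subjects and $d_g = 6, 4, 3, 7$ cases in the four covariate patterns (observed risk times 10), and it takes the reported value $-2\ln\hat{L}_{C,SS} = 51.3552$ as given rather than refitting the model.

```python
import math

# Assumed group sizes and case counts for the four covariate patterns of Model 3.
n = [10, 10, 10, 10]
d = [6, 4, 3, 7]

# K = ln of the product of binomial coefficients n_g! / (d_g! (n_g - d_g)!),
# computed as a sum of logs.
K = sum(math.log(math.comb(ng, dg)) for ng, dg in zip(n, d))

neg2_ll_ss = 51.3552              # reported -2 ln L for the SS format
neg2_ll_et = neg2_ll_ss - 2 * K   # -2 ln L for the ET format

print(f"K          = {K:.4f}")
print(f"-2 ln L_ET = {neg2_ll_et:.4f}")
```

The binomial coefficients depend only on the counts in the data, not on any regression coefficient, which is why shifting $-2\ln\hat{L}$ by $2K$ leaves the location of its minimum unchanged.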

316 9. Assessing Goodness of Fit for Logistic Regression
