Logistic Regression: A Self-learning Text, Third Edition (Statistics in the Health Sciences)

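EXAMPLE

| Covariate pattern | Obs. risk | Pred. risk |
| --- | --- | --- |
| $X_1$: $E = 1$, $V = 1$ | $\hat{p}_1 = 0.6$ | $\hat{P}(X_1) = 0.6$ |
| $X_2$: $E = 0$, $V = 1$ | $\hat{p}_2 = 0.4$ | $\hat{P}(X_2) = 0.4$ |
| $X_3$: $E = 1$, $V = 0$ | $\hat{p}_3 = 0.3$ | $\hat{P}(X_3) = 0.3$ |
| $X_4$: $E = 0$, $V = 0$ | $\hat{p}_4 = 0.7$ | $\hat{P}(X_4) = 0.7$ |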

$$\mathrm{Dev}_{SS}(\hat{\beta}) = -2(10)\big[\,0.6\ln(0.6/0.4) + \ln(0.4) + 0.4\ln(0.4/0.6) + \ln(0.6) + 0.3\ln(0.3/0.7) + \ln(0.7) + 0.7\ln(0.7/0.3) + \ln(0.3)\,\big] = 51.3552$$

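$$\mathrm{Dev}_{ET}(\hat{\beta}) = 0.0 \neq \mathrm{Dev}_{SS}(\hat{\beta}) = 51.3552$$

$\mathrm{Dev}_{ET}(\hat{\beta}) = 0.0$ because

$$-2\ln\hat{L}_{ET\ \mathrm{saturated}} = -2\ln\hat{L}_{\mathrm{Model\ 3}} = -2\ln\hat{L}_{C}$$

so

$$\mathrm{Dev}_{ET}(\hat{\beta}) = -2\ln\hat{L}_{C} - \left(-2\ln\hat{L}_{ET\ \mathrm{saturated}}\right) = -2\ln\hat{L}_{\mathrm{Model\ 3}} - \left(-2\ln\hat{L}_{\mathrm{Model\ 3}}\right) = 0.0$$

$\mathrm{Dev}_{SS}(\hat{\beta}) \neq 0.0$ because $-2\ln\hat{L}_{SS\ \mathrm{saturated}} = 0.0$, so

$$\mathrm{Dev}_{SS}(\hat{\beta}) = -2\ln\hat{L}_{C} - \left(-2\ln\hat{L}_{SS\ \mathrm{saturated}}\right) = -2\ln\hat{L}_{\mathrm{Model\ 3}} - 0 = 51.3552$$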

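Note: $-2\ln\hat{L}_{C,ET} \neq -2\ln\hat{L}_{C,SS}$

$$-2\ln\hat{L}_{C,ET} = -2\ln\hat{L}_{C,SS} - 2K, \quad \text{where } K = \sum_{g=1}^{G} \ln\frac{n_g!}{d_g!\,(n_g - d_g)!}$$

Model 3: $-2\ln\hat{L}_{C,SS} = 51.3552$, $K = 20.2692$, $-2\ln\hat{L}_{C,ET} = 10.8168$

$K$ does not involve $\beta$, so $\hat{\beta}$ is the same for SS or ET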

To compute this formula for Model 3, we need to provide values of $\hat{P}(X_i)$ for each of the 40 subjects in the data set. Nevertheless, this calculation can be simplified since there are only four distinct values of $\hat{P}(X_i)$ over all 40 subjects. These correspond to the four covariate patterns of Model 3.

The calculation is now shown at the left, where we have substituted each of the four distinct values of $\hat{P}(X_i)$ 10 times in the above formula.
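As a rough check on this arithmetic, the sketch below sums the subject-specific term $\hat{P}\ln[\hat{P}/(1-\hat{P})] + \ln(1-\hat{P})$ once per covariate pattern and multiplies by 10. The variable and function names are our own, chosen for illustration. With the predicted risks rounded to one decimal as in the example, the result should agree with the reported 51.3552 to about three decimal places.

```python
import math

# Predicted risks for the four covariate patterns of Model 3 (from the example).
pred_risk = {"X1": 0.6, "X2": 0.4, "X3": 0.3, "X4": 0.7}
n_per_pattern = 10  # each pattern covers 10 of the 40 subjects


def dev_ss_term(p):
    """One subject's contribution: p*ln(p/(1-p)) + ln(1-p)."""
    return p * math.log(p / (1 - p)) + math.log(1 - p)


# Substitute each distinct predicted risk 10 times, then multiply by -2.
dev_ss = -2 * n_per_pattern * sum(dev_ss_term(p) for p in pred_risk.values())
print(f"Dev_SS = {dev_ss:.4f}")
```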

We have thus seen that the events-trials (ET) and subject-specific (SS) deviance values obtained for Model 3 are numerically quite different.

$\mathrm{Dev}_{ET}(\hat{\beta})$ is zero because the ET formula assumes that the (group-) saturated model is the fully parameterized Model 3, and the current model being considered is also Model 3. The values of their corresponding log likelihood statistics ($-2\ln\hat{L}$) are therefore equal, and their difference is zero.
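The cancellation can be illustrated with a short sketch that evaluates the grouped (binomial) log likelihood twice: once at the Model 3 predicted risks and once at the observed group proportions $d_g/n_g$. The counts $d_g$ are not listed in the example; we assume they equal the observed risk times 10 (6, 4, 3, and 7 cases). The function name `neg2_loglik_et` is hypothetical.

```python
import math

# (d_g, n_g, predicted risk) per covariate pattern.
# Assumption: d_g = observed risk x 10 subjects per pattern.
groups = [(6, 10, 0.6), (4, 10, 0.4), (3, 10, 0.3), (7, 10, 0.7)]


def neg2_loglik_et(groups, use_observed):
    """-2 ln L for events-trials data, including the binomial coefficient."""
    total = 0.0
    for d, n, p_model in groups:
        # ET-saturated model: one parameter per group, so p = d/n.
        p = d / n if use_observed else p_model
        total += (d * math.log(p)
                  + (n - d) * math.log(1 - p)
                  + math.log(math.comb(n, d)))
    return -2 * total


neg2_lnL_current = neg2_loglik_et(groups, use_observed=False)      # Model 3
neg2_lnL_et_saturated = neg2_loglik_et(groups, use_observed=True)  # group-saturated
dev_et = neg2_lnL_current - neg2_lnL_et_saturated
print(f"Dev_ET = {dev_et:.4f}")
```

Because Model 3 reproduces the observed proportion in every covariate pattern, both evaluations use the same probabilities and the difference vanishes.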

In contrast, $\mathrm{Dev}_{SS}(\hat{\beta})$ is different from zero because the SS formula assumes that the saturated model is the "classical" (SS) saturated model that perfectly predicts each subject's 0 or 1 outcome. As mentioned earlier, $-2\ln\hat{L}$ is always zero for the SS saturated model. Thus, $\mathrm{Dev}_{SS}(\hat{\beta})$ simplifies to $-2\ln\hat{L}_C$ for the current model (i.e., Model 3), whose value is 51.3552.
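To see why the SS saturated model contributes nothing, consider a minimal sketch in which each subject's predicted probability is set equal to that subject's own 0/1 outcome. The outcome list below is made up for illustration (6 cases among 10 subjects); any 0/1 sequence gives the same result.

```python
import math

# Hypothetical 0/1 outcomes for 10 subjects (illustration only).
outcomes = [1] * 6 + [0] * 4

# SS saturated model: predicted probability equals the observed outcome,
# so every subject's likelihood contribution p^y * (1-p)^(1-y) is 1.
likelihood = 1.0
for y in outcomes:
    p = float(y)
    likelihood *= (p ** y) * ((1 - p) ** (1 - y))  # Python evaluates 0.0**0 as 1.0

neg2_lnL_ss_saturated = -2 * math.log(likelihood)

# The SS deviance then reduces to -2 ln L for the current model.
neg2_lnL_current = 51.3552  # value reported for Model 3
dev_ss = neg2_lnL_current - neg2_lnL_ss_saturated
print(f"Dev_SS = {dev_ss:.4f}")
```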

Mathematically, the formula for $-2\ln\hat{L}_C$ differs with the (ET or SS) format being used to specify the data. However, these formulae differ only by a constant $K$, as we show on the left and illustrate for Model 3. Because the formula for $K$ does not involve $\beta$, the ML estimate $\hat{\beta}$ will be the same for either format. Consequently, some computer packages (e.g., SAS) present the same value (i.e., $-2\ln\hat{L}_{C,SS}$) regardless of the data layout used.
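The constant can be checked numerically. The sketch below takes $K$ as the sum over covariate patterns of the log binomial coefficient $\ln[n_g!/(d_g!(n_g-d_g)!)]$ and again assumes the counts $d_g$ are 6, 4, 3, and 7 out of 10, inferred from the observed risks. It starts from the reported SS value rather than recomputing it.

```python
import math

# (d_g, n_g) per covariate pattern; d_g inferred as observed risk x 10.
groups = [(6, 10), (4, 10), (3, 10), (7, 10)]

# K = sum over groups of ln[ n_g! / (d_g! (n_g - d_g)!) ]; it contains no beta.
K = sum(math.log(math.comb(n, d)) for d, n in groups)

neg2_lnL_ss = 51.3552              # reported for Model 3, SS format
neg2_lnL_et = neg2_lnL_ss - 2 * K  # corresponding ET-format value

print(f"K = {K:.4f}")
print(f"-2 ln L (ET) = {neg2_lnL_et:.4f}")
```

Since $K$ depends only on the counts $d_g$ and $n_g$, subtracting $2K$ shifts the log likelihood by the same amount for every value of $\beta$ and leaves its maximizer unchanged.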

9. Assessing Goodness of Fit for Logistic Regression
