Logistic Regression: A Self-learning Text, Third Edition (Statistics in the Health Sciences)


Model A: $\mathbf{X} = (X_1, X_2, X_3)$, fully parameterized, but

Model B: $\mathbf{X} = (X_1, X_2, X_3, X_4, X_5, X_6)$, “better fit” than Model A


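EXAMPLE

e.g., LR test:

Model 1: $\operatorname{logit} P(\mathbf{X}) = \alpha + \beta E$
vs.
Model 3: $\operatorname{logit} P(\mathbf{X}) = \alpha + \beta E + \gamma V + \delta EV$

$LR = -2 \ln \hat{L}_{\text{Model 1}} - (-2 \ln \hat{L}_{\text{Model 3}}) = 55.051 - 51.355 = 3.696$
$\chi^2$ (2 df) under $H_0: \gamma = \delta = 0$
($P > 0.10$), n.s.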

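Fitted Model 3:
$\operatorname{logit} \hat{P}(\mathbf{X}) = \hat{\alpha} + \hat{\beta} E + \hat{\gamma} V + \hat{\delta} EV$,
where $\hat{\alpha} = 0.8473$, $\hat{\beta} = -1.6946$,
$\hat{\gamma} = -1.2528$, $\hat{\delta} = 2.5055$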

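Covariate pattern                 Obs. risk            Pred. risk
$\mathbf{X}_1$: E = 1, V = 1      $\hat{p}_1 = 0.6$    $\hat{P}(\mathbf{X}_1) = 0.6$
$\mathbf{X}_2$: E = 0, V = 1      $\hat{p}_2 = 0.4$    $\hat{P}(\mathbf{X}_2) = 0.4$
$\mathbf{X}_3$: E = 1, V = 0      $\hat{p}_3 = 0.3$    $\hat{P}(\mathbf{X}_3) = 0.3$
$\mathbf{X}_4$: E = 0, V = 0      $\hat{p}_4 = 0.7$    $\hat{P}(\mathbf{X}_4) = 0.7$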

Model 3: No perfect fit

E = 1, V = 1: $\hat{P}(\mathbf{X}_1) = 0.6 \neq 0$ or 1
E = 0, V = 1: $\hat{P}(\mathbf{X}_2) = 0.4 \neq 0$ or 1

E = 1: some D = 1, some D = 0
E = 0: some D = 1, some D = 0


Note, however, that even if a model A is fully parameterized,
there may be a larger model B, containing covariates not
originally considered in model A, that provides “better fit”
than model A.

For instance, although Model 1 is the largest model that can be
defined when only binary E is considered, it is not the largest
model that can be defined when binary V is also considered.
Nevertheless, we can choose between Model 1 and Model 3 by
performing a standard likelihood ratio (LR) test that compares
the two models; the (nonsignificant) LR results (P > 0.10) are
shown at the left.
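
The LR arithmetic can be checked with a short Python sketch. It assumes 10 subjects in each of the four covariate patterns, which is not stated on this page but is consistent with the 40 datalines, the observed risks of 0.6, 0.4, 0.3, and 0.7, and the reported values of −2 ln L. The names `data`, `neg2_log_lik`, and `by_e` are illustrative only. Because each model here reproduces the observed proportion in every group it distinguishes, its maximized likelihood can be computed directly from those proportions.

```python
import math

# ASSUMPTION: 10 subjects per covariate pattern (not stated on this page).
# (E, V) -> (cases, subjects), implied by observed risks 0.6, 0.4, 0.3, 0.7.
data = {(1, 1): (6, 10), (0, 1): (4, 10), (1, 0): (3, 10), (0, 0): (7, 10)}

def neg2_log_lik(groups):
    """-2 ln L when the model fits each group's observed proportion exactly."""
    total = 0.0
    for cases, n in groups:
        p = cases / n
        total += cases * math.log(p) + (n - cases) * math.log(1 - p)
    return -2.0 * total

# Model 1 uses E only, so pool the counts over V within each level of E.
by_e = {}
for (e, v), (cases, n) in data.items():
    pooled_cases, pooled_n = by_e.get(e, (0, 0))
    by_e[e] = (pooled_cases + cases, pooled_n + n)

dev1 = neg2_log_lik(by_e.values())   # Model 1: logit P(X) = a + bE
dev3 = neg2_log_lik(data.values())   # Model 3: adds V and EV terms
lr = dev1 - dev3

# For a chi-square variable with 2 df, the upper-tail probability is exp(-x/2).
p_value = math.exp(-lr / 2.0)

print(f"-2 ln L (Model 1) = {dev1:.3f}")
print(f"-2 ln L (Model 3) = {dev3:.3f}")
print(f"LR = {lr:.3f}, P = {p_value:.3f}")
```

Under the stated assumption, the sketch reproduces 55.051, 51.355, and LR = 3.696, with a P-value of about 0.16, which agrees with P > 0.10.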

Focusing now on Model 3, we show the fitted model at the left.
Note that these same estimated parameters would be obtained
whether we input the data by individual subjects (40 datalines)
or by using an events–trials format (4 datalines). However,
using events–trials format, we lose the ability to identify
which subjects become cases.
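
One way to see why the two input formats give the same estimates is to compare their log-likelihoods as functions of the parameters. The Python sketch below is illustrative only: it assumes 10 subjects per covariate pattern (consistent with 40 datalines and the observed risks), uses hypothetical helper names, and takes the coefficient signs that reproduce the predicted risks listed for the four patterns. A binomial likelihood written for events–trials data also carries a combinatorial constant, omitted here, which does not depend on the parameters.

```python
import math

def risk(params, e, v):
    """Predicted risk under Model 3 for exposure e and covariate v."""
    a, b, g, d = params
    return 1.0 / (1.0 + math.exp(-(a + b * e + g * v + d * e * v)))

# Events-trials format: (E, V, cases, subjects). ASSUMPTION: 10 per pattern.
grouped = [(1, 1, 6, 10), (0, 1, 4, 10), (1, 0, 3, 10), (0, 0, 7, 10)]

# Individual format: one (E, V, D) record per subject, 40 records in all.
individual = []
for e, v, cases, n in grouped:
    individual += [(e, v, 1)] * cases + [(e, v, 0)] * (n - cases)

def loglik_individual(params):
    return sum(
        math.log(risk(params, e, v)) if outcome == 1
        else math.log(1.0 - risk(params, e, v))
        for e, v, outcome in individual
    )

def loglik_grouped(params):
    return sum(
        cases * math.log(risk(params, e, v))
        + (n - cases) * math.log(1.0 - risk(params, e, v))
        for e, v, cases, n in grouped
    )

fitted = (0.8473, -1.6946, -1.2528, 2.5055)  # fitted Model 3 coefficients
other = (0.1, -0.2, 0.3, 0.4)                # arbitrary values, for comparison

for params in (fitted, other):
    print(loglik_individual(params), loglik_grouped(params))
```

The two functions agree at every parameter value, so they are maximized at the same estimates. The 4-line format stores only counts, which is why it cannot say which subjects became cases.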

The predicted risks obtained from the fitted model for each
covariate pattern are also shown at the left, together with
their corresponding observed risks. Notice that these predicted
risks are equal to their corresponding observed proportions
computed from the observed (stratified) data.
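
Because Model 3 has one parameter for each of the four covariate patterns, its coefficients can be recovered directly from the logits of the observed risks. The following Python sketch illustrates this under that reading; the names `obs`, `logit`, and `expit` are illustrative and do not come from the text.

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def expit(x):
    return 1.0 / (1.0 + math.exp(-x))

# Observed risks by covariate pattern, keyed by (E, V).
obs = {(1, 1): 0.6, (0, 1): 0.4, (1, 0): 0.3, (0, 0): 0.7}

# Solve logit P(X) = alpha + beta*E + gamma*V + delta*E*V pattern by pattern.
alpha = logit(obs[(0, 0)])                         # E = 0, V = 0
beta = logit(obs[(1, 0)]) - alpha                  # E = 1, V = 0
gamma = logit(obs[(0, 1)]) - alpha                 # E = 0, V = 1
delta = logit(obs[(1, 1)]) - alpha - beta - gamma  # E = 1, V = 1

# Predicted risks from the recovered coefficients.
pred = {
    (e, v): expit(alpha + beta * e + gamma * v + delta * e * v)
    for (e, v) in obs
}

print(round(alpha, 4), round(beta, 4), round(gamma, 4), round(delta, 4))
for pattern in obs:
    print(pattern, obs[pattern], round(pred[pattern], 4))
```

The recovered values round to 0.8473, −1.6946, −1.2528, and 2.5055, and the predicted risks match the observed risks of 0.6, 0.4, 0.3, and 0.7.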

Nevertheless, none of these predicted risks is either 0 or 1,
so the fitted model does not perfectly predict each subject’s
observed outcome, which is either 0 or 1. This is not
surprising, since some exposed subjects develop the disease and
some do not.

Presentation: II. Saturated vs. Fully Parameterized Models 309