Logistic Regression: A Self-learning Text, Third Edition (Statistics in the Health Sciences)

Model A: $X = (X_1, X_2, X_3)$, fully parameterized, but

Model B: $X = (X_1, X_2, X_3, X_4, X_5, X_6)$ may provide “better fit” than Model A

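EXAMPLE

e.g., LR test:

Model 1: $\operatorname{logit} P(X) = \alpha + \beta E$
vs.
Model 3: $\operatorname{logit} P(X) = \alpha + \beta E + \gamma V + \delta EV$

$LR = -2 \ln \hat{L}_{\text{Model 1}} - (-2 \ln \hat{L}_{\text{Model 3}}) = 55.051 - 51.355 = 3.696 \sim \chi^2$ (2 df) under $H_0\colon \gamma = \delta = 0$ ($P > 0.10$), n.s.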

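Fitted Model 3: $\operatorname{logit} \hat{P}(X) = \hat{\alpha} + \hat{\beta} E + \hat{\gamma} V + \hat{\delta} EV$, where $\hat{\alpha} = 0.8473$, $\hat{\beta} = -1.6946$, $\hat{\gamma} = -1.2528$, $\hat{\delta} = 2.5055$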

Covariate pattern        Obs. risk             Pred. risk
X_1: E = 1, V = 1        $\hat{p}_1 = 0.6$     $\hat{P}(X_1) = 0.6$
X_2: E = 0, V = 1        $\hat{p}_2 = 0.4$     $\hat{P}(X_2) = 0.4$
X_3: E = 1, V = 0        $\hat{p}_3 = 0.3$     $\hat{P}(X_3) = 0.3$
X_4: E = 0, V = 0        $\hat{p}_4 = 0.7$     $\hat{P}(X_4) = 0.7$

Model 3: No perfect fit

E = 1, V = 1: $\hat{P}(X_1) = 0.6 \neq 0$ or 1
E = 0, V = 1: $\hat{P}(X_2) = 0.4 \neq 0$ or 1

E = 1: some D = 1, some D = 0
E = 0: some D = 1, some D = 0

Note, however, that even if a model A is fully parameterized, there may be a larger model B, containing covariates not originally considered in model A, that provides “better fit” than model A.

For instance, although Model 1 is the largest model that can be defined when only binary E is considered, it is not the largest model that can be defined when binary V is also considered. Nevertheless, we can choose between Model 1 and Model 3 by performing a standard likelihood ratio (LR) test that compares the two models; the (nonsignificant) LR results (P > 0.10) are shown above.
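As a minimal sketch of the arithmetic behind this LR test, the following reproduces the statistic from the two reported −2 ln L values. The variable names are illustrative, and the P-value uses the closed-form upper tail of a chi-square distribution with exactly 2 df, which is exp(−x/2).

```python
import math

# -2 ln L values as reported in the text for the two fitted models.
neg2_lnL_model1 = 55.051  # logit P(X) = alpha + beta*E
neg2_lnL_model3 = 51.355  # logit P(X) = alpha + beta*E + gamma*V + delta*E*V

# LR statistic: difference in -2 ln L between the reduced and the full model.
lr_stat = neg2_lnL_model1 - neg2_lnL_model3

# Model 3 has two more parameters (gamma, delta) than Model 1.
df = 2

# For a chi-square with exactly 2 df, P(chi2 > x) = exp(-x / 2).
p_value = math.exp(-lr_stat / 2)

print(f"LR = {lr_stat:.3f} on {df} df, P = {p_value:.3f}")
```

The resulting P-value is larger than 0.10, in line with the "n.s." conclusion stated in the text.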

Focusing now on Model 3, we show the fitted model above. Note that these same estimated parameters would be obtained whether we input the data by individual subjects (40 datalines) or by using an events–trials format (4 datalines). However, using events–trials format, we lose the ability to identify which subjects become cases.
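One way to see why the two input formats give the same estimates is that the log likelihood computed from individual records equals the grouped (events–trials) log likelihood, apart from a constant that does not involve the parameters. The sketch below illustrates this. It assumes 10 subjects per covariate pattern, which is not stated on this page but is consistent with 40 datalines, 4 patterns, and the reported −2 ln L of 51.355. The function names are hypothetical.

```python
import math

# Events-trials layout: (E, V, cases, trials).
# ASSUMPTION: 10 subjects per covariate pattern (40 subjects in total).
events_trials = [(1, 1, 6, 10), (0, 1, 4, 10), (1, 0, 3, 10), (0, 0, 7, 10)]

# Expand to one record per subject: (E, V, D).
individuals = []
for e, v, cases, trials in events_trials:
    individuals += [(e, v, 1)] * cases + [(e, v, 0)] * (trials - cases)


def risk(params, e, v):
    """Model 3 predicted risk for a given (E, V)."""
    a, b, g, d = params
    return 1.0 / (1.0 + math.exp(-(a + b * e + g * v + d * e * v)))


def loglik_individual(params):
    """Bernoulli log likelihood summed over the 40 individual records."""
    total = 0.0
    for e, v, y in individuals:
        p = risk(params, e, v)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


def loglik_grouped(params):
    """Binomial log likelihood over the 4 patterns, omitting the constant log C(n, k)."""
    total = 0.0
    for e, v, k, n in events_trials:
        p = risk(params, e, v)
        total += k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total


fitted = (0.8473, -1.6946, -1.2528, 2.5055)  # estimates reported for Model 3
arbitrary = (0.0, 0.5, -0.5, 1.0)            # any other parameter values

for params in (fitted, arbitrary):
    print(loglik_individual(params), loglik_grouped(params))
```

Because the two functions agree at every parameter value, they are maximized by the same estimates. The expansion step also shows what the events–trials format discards: the grouped records carry only counts, not which subject had D = 1.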

The predicted risks obtained from the fitted model for each covariate pattern are also shown above, together with their corresponding observed risks. Notice that these predicted risks are equal to their corresponding observed proportions computed from the observed (stratified) data.
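Since Model 3 has four parameters and there are four covariate patterns, its estimates can be recovered directly from the logits of the observed risks. The following sketch, with illustrative helper names `logit` and `expit`, does this and then confirms that the predicted risks reproduce the observed ones.

```python
import math


def logit(p):
    return math.log(p / (1.0 - p))


def expit(x):
    return 1.0 / (1.0 + math.exp(-x))


# Observed risks by covariate pattern (E, V), taken from the table in the text.
observed = {(1, 1): 0.6, (0, 1): 0.4, (1, 0): 0.3, (0, 0): 0.7}

# Solve logit P(X) = alpha + beta*E + gamma*V + delta*E*V at the four patterns.
alpha = logit(observed[(0, 0)])
beta = logit(observed[(1, 0)]) - alpha
gamma = logit(observed[(0, 1)]) - alpha
delta = logit(observed[(1, 1)]) - alpha - beta - gamma

# Predicted risk for each covariate pattern under the fitted model.
predicted = {
    (e, v): expit(alpha + beta * e + gamma * v + delta * e * v)
    for (e, v) in observed
}

print(round(alpha, 4), round(beta, 4), round(gamma, 4), round(delta, 4))
for pattern in observed:
    print(pattern, observed[pattern], round(predicted[pattern], 4))
```

The recovered coefficients agree with the fitted Model 3 estimates to four decimal places, including their signs.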

Nevertheless, none of these predicted risks is 0 or 1, so the fitted model does not perfectly predict each subject’s observed outcome, which is either 0 or 1. This is not surprising, since some exposed subjects develop the disease and some do not.

Presentation: II. Saturated vs. Fully Parameterized Models 309
