Introductory Biostatistics

(Chris Devlin) #1

horizontal axis, we have the figure called a scatter diagram, as seen in Chapter
2 and again in the next few examples. The scatter diagram is a useful diagnostic
tool for checking the validity of features of the simple linear regression
model. For example, if the dots fall around a curve rather than a straight line,
the linearity assumption may be violated. In addition, the model stipulates that
for each level of $X$, the normal distribution for $Y$ has constant variance, not
depending on the value of $X$. That would lead to a scatter diagram with dots
spreading out around the line evenly across levels of $X$. In most cases an
appropriate transformation, such as taking the logarithm of $Y$ or $X$, would
improve the fit and bring the data closer to the model.
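As a rough illustration of how a logarithmic transformation can restore linearity, the following sketch simulates data in which the logarithm of the response, not the response itself, is linear in the covariate, and compares the fit of a straight line before and after the transformation. The data, the seed, the coefficient values, and the helper name `r_squared` are all assumptions made for this illustration; they are not taken from the text.

```python
import numpy as np


def r_squared(x, y):
    """Proportion of variance in y explained by the least-squares line on x."""
    slope, intercept = np.polyfit(x, y, 1)  # highest-degree coefficient first
    residuals = y - (intercept + slope * x)
    return 1.0 - residuals.var() / y.var()


# Hypothetical data: log(Y) is linear in X, so Y itself follows a curve.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 5.0, size=200)
y = np.exp(0.5 + 0.8 * x + rng.normal(0.0, 0.2, size=200))

r2_raw = r_squared(x, y)          # straight line fitted to (X, Y)
r2_log = r_squared(x, np.log(y))  # straight line fitted to (X, log Y)

print(f"R^2 on the original scale: {r2_raw:.3f}")
print(f"R^2 after taking log(Y):   {r2_log:.3f}")
```

In this simulated setting the straight line describes the transformed data better than the original data. In practice the scatter diagram itself, rather than a single summary number, remains the primary check.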


8.1.3 Meaning of Regression Parameters


The parameters $\beta_0$ and $\beta_1$ are called regression coefficients. The
parameter $\beta_0$ is the intercept of the regression line. If the scope of the
model includes $X = 0$, $\beta_0$ gives the mean of $Y$ when $X = 0$; when the
scope of the model does not cover $X = 0$, $\beta_0$ does not have any particular
meaning as a separate term in the regression model. As for the meaning of
$\beta_1$, our more important parameter, it can be seen as follows. We first
consider the case of a binary independent variable with the conventional coding


\[
X_i =
\begin{cases}
0 & \text{if the patient is not exposed} \\
1 & \text{if the patient is exposed}
\end{cases}
\]




Here the term "exposed" may refer to a risk factor such as smoking or a patient’s
characteristic such as race (white/nonwhite) or gender (male/female). It can be
seen that:



  1. For a nonexposed subject (i.e., $X = 0$),

\[
\mu_y = \beta_0
\]

  2. For an exposed subject (i.e., $X = 1$),

\[
\mu_y = \beta_0 + \beta_1
\]
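The two cases above can be checked numerically. The sketch below fits a least-squares line to simulated data with a 0/1 exposure indicator and compares the fitted intercept and slope with the two group means. The sample size, the seed, and the values 120, 10, and 5 are hypothetical choices for this illustration, not figures from the text.

```python
import numpy as np

# Hypothetical data: X is a 0/1 exposure indicator, Y a continuous response.
rng = np.random.default_rng(1)
x = rng.integers(0, 2, size=100)  # 0 = not exposed, 1 = exposed
y = 120.0 + 10.0 * x + rng.normal(0.0, 5.0, size=100)

# Least-squares line; np.polyfit returns the slope first, then the intercept.
slope, intercept = np.polyfit(x, y, 1)

mean_unexposed = y[x == 0].mean()
mean_exposed = y[x == 1].mean()

print(f"intercept = {intercept:.3f}, mean of Y when X = 0: {mean_unexposed:.3f}")
print(f"slope     = {slope:.3f}, difference in means:  "
      f"{mean_exposed - mean_unexposed:.3f}")
```

With a binary covariate, the fitted intercept coincides with the sample mean of the nonexposed group, and the fitted slope with the difference between the two sample means. This is the sample counterpart of the two expressions for the mean of $Y$.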

Hence, $\beta_1$ represents the increase (or decrease, if $\beta_1$ is negative)
in the mean of $Y$ associated with the exposure. Similarly, we have for a
continuous covariate $X$ and any value $x$ of $X$:



  1. When $X = x$,

\[
\mu_y = \beta_0 + \beta_1 x
\]

