multiple linear regression is referred to as the multiple coefficient of determination. We reproduce its definition from Chapter 2 here:
R^2 = SSR / SST    (3.12)
where SSR = sum of squares explained by the regression model
SST = total sum of squares
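As a rough illustration, the following Python sketch computes R^2 from a vector of observations y and the model's fitted values y_hat, both assumed to be NumPy arrays from a regression fitted with an intercept; the function name and arguments are illustrative only.

```python
import numpy as np

def r_squared(y, y_hat):
    """Multiple coefficient of determination, R^2 = SSR / SST (equation 3.12)."""
    sst = np.sum((y - np.mean(y)) ** 2)      # SST: total sum of squares
    ssr = np.sum((y_hat - np.mean(y)) ** 2)  # SSR: sum of squares explained by the model
    return ssr / sst
```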
Following this initial assessment, one needs to verify the model by determining its statistical significance. To do so, we test the significance of the overall model as well as the significance of the individual regression coefficients. The estimated regression errors play an important role here. If the standard deviation of the regression errors turns out to be too large, the fit could likely be improved by an alternative model. The reason is that too much of the variance of the dependent variable y is attributed to the residual variance s^2. Some of this residual variance may, in fact, be the result of variation in an independent variable not yet included in the model. A final aspect is testing for interaction between the independent variables, which we discuss in the next chapter.
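As a brief sketch of the residual variance just mentioned, the following hypothetical helper estimates s^2 as SSE divided by its degrees of freedom, n - k - 1, for a model with k independent variables; the names y, y_hat, and k are assumptions for illustration.

```python
import numpy as np

def residual_variance(y, y_hat, k):
    """Estimated residual variance s^2 = SSE / (n - k - 1)
    for a regression with k independent variables and an intercept."""
    n = len(y)
    sse = np.sum((y - y_hat) ** 2)  # SSE: unexplained sum of squares
    return sse / (n - k - 1)
```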
Testing for the Significance of the Model
To test whether the entire model is significant, we consider two alternative hypotheses. The first, our null hypothesis H_0, states that all regression coefficients are equal to zero, which means that none of the independent variables plays any role. The alternative hypothesis H_1 states that at least one coefficient is different from zero. More formally,
H_0: β_1 = β_2 = ... = β_k = 0
H_1: β_j ≠ 0 for at least one j ∈ {1, 2, ..., k}
If the null hypothesis is true, the linear model with the independent variables we have chosen does not describe the behavior of the dependent variable. To perform the test, we carry out an analysis of variance (ANOVA). In this context, we compute the F-statistic defined by
F = MSR / MSE = (SSR / k) / [SSE / (n - k - 1)]    (3.13)
where SSE = unexplained sum of squares
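As a minimal sketch of this overall F-test, assuming y and y_hat are NumPy arrays from a regression with k independent variables and an intercept, the following Python function computes the F-statistic of equation (3.13) and its right-tail probability from the F(k, n - k - 1) distribution; function and variable names are illustrative only.

```python
import numpy as np
from scipy import stats

def overall_f_test(y, y_hat, k):
    """F-test of overall significance: H_0 that all k slope coefficients
    are zero versus H_1 that at least one differs from zero."""
    n = len(y)
    sst = np.sum((y - np.mean(y)) ** 2)         # total sum of squares
    sse = np.sum((y - y_hat) ** 2)              # SSE: unexplained sum of squares
    ssr = sst - sse                             # SSR: explained sum of squares
    msr = ssr / k                               # mean square due to regression
    mse = sse / (n - k - 1)                     # mean square error
    f_stat = msr / mse                          # equation (3.13)
    p_value = stats.f.sf(f_stat, k, n - k - 1)  # P(F > f_stat) under F(k, n - k - 1)
    return f_stat, p_value
```

A small p-value would lead us to reject H_0 and conclude that at least one regression coefficient is different from zero.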