Applied Statistics and Probability for Engineers

12-2 HYPOTHESIS TESTS IN MULTIPLE LINEAR REGRESSION

relationship found is an appropriate model for predicting pull strength as a function of wire
length and die height. Further tests of model adequacy are required before we can be
comfortable using this model in practice.

Most multiple regression computer programs provide the test for significance of regression
in their output display. The middle portion of Table 12-4 is the Minitab output for this example.
Compare Tables 12-4 and 12-10 and note their equivalence apart from rounding. The P-value is
rounded to zero in the computer output.

R^2 and Adjusted R^2
We may also use the coefficient of multiple determination R^2 as a global statistic to assess
the fit of the model. Computationally,

$$R^2 = \frac{SS_R}{SS_T} = 1 - \frac{SS_E}{SS_T} \qquad (12\text{-}21)$$

For the wire bond pull strength data, we find that R^2 = SS_R/SS_T = 5990.7712/6105.9447 =
0.9811. Thus the model accounts for about 98% of the variability in the pull strength response
(refer to the Minitab output in Table 12-4). The R^2 statistic is somewhat problematic as a
measure of the quality of the fit for a multiple regression model because it always increases
when a variable is added to a model.
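As a quick arithmetic check, both forms of Equation 12-21 can be evaluated from the sums of squares reported for this example in Table 12-10. The sketch below is illustrative only; the variable names are chosen here and do not come from the text.

```python
# Sums of squares for the wire bond pull strength model (values from Table 12-10).
ss_r = 5990.7712   # regression sum of squares, SS_R
ss_e = 115.1735    # error (residual) sum of squares, SS_E
ss_t = 6105.9447   # total sum of squares, SS_T

r2_from_ssr = ss_r / ss_t        # R^2 = SS_R / SS_T
r2_from_sse = 1.0 - ss_e / ss_t  # R^2 = 1 - SS_E / SS_T

print(round(r2_from_ssr, 4))  # 0.9811
print(round(r2_from_sse, 4))  # 0.9811
```

The two forms agree because SS_T = SS_R + SS_E for these data.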
To illustrate, consider the model fit to wire bond pull strength data in Example 11-8. This
was a simple linear regression model with x_1 = wire length as the regressor. The value of R^2
for this model is R^2 = 0.9640. Therefore, adding x_2 = die height to the model increases R^2 by
0.9811 - 0.9640 = 0.0171, a very small amount. Since R^2 always increases when a regressor
is added, it can be difficult to judge whether the increase is telling us anything useful about the
new regressor. It is particularly hard to interpret a small increase, such as observed in the pull
strength data.
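The behavior described above can be seen in a small simulation. The following is a minimal sketch on synthetic data (not the wire bond data, which are not reproduced on this page): a response is generated from one regressor, and a second regressor made of pure noise is then added. The helper names `solve` and `r_squared`, the seed, and the data-generating values are all assumptions made for illustration.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def r_squared(columns, y):
    """Least squares fit of y on an intercept plus the given columns; returns 1 - SS_E/SS_T."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Normal equations (X'X) beta = X'y.
    xtx = [[sum(X[i][p] * X[i][q] for i in range(n)) for q in range(k)] for p in range(k)]
    xty = [sum(X[i][p] * y[i] for i in range(n)) for p in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xij for bj, xij in zip(beta, row)) for row in X]
    y_bar = sum(y) / n
    ss_e = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_t = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_e / ss_t

random.seed(0)
n = 25
x1 = [random.uniform(1, 20) for _ in range(n)]       # informative regressor
noise_x = [random.gauss(0, 1) for _ in range(n)]     # unrelated to y by construction
y = [2.0 + 3.0 * a + random.gauss(0, 2) for a in x1]

r2_one = r_squared([x1], y)
r2_two = r_squared([x1, noise_x], y)
print(r2_one, r2_two)  # the second value is never smaller than the first
```

Even though `noise_x` carries no information about `y`, the least squares fit with the extra column cannot have a larger SS_E, so R^2 does not go down.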
Many regression users prefer to use an adjusted R^2 statistic, defined in Equation 12-22.
Because SS_E/(n - p) is the error or residual mean square and SS_T/(n - 1) is a constant, R^2_adj will
only increase when a variable is added to the model if the new variable reduces the error mean
square. Note that for the multiple regression model for the pull strength data R^2_adj = 0.979 (see the
Minitab output in Table 12-4), whereas in Example 11-8 the adjusted R^2 for the one-variable
model is R^2_adj = 0.962. Therefore, we would conclude that adding x_2 = die height to the model
does result in a meaningful reduction in unexplained variability in the response.
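The two adjusted values quoted above can be reproduced from Equation 12-22. The sketch below assumes n = 25 and p = 3 for the two-regressor model, inferred from the degrees of freedom in Table 12-10 (total 24 = n - 1, error 22 = n - p). For the one-regressor model only R^2 = 0.9640 is quoted here, so the algebraically equivalent form 1 - (1 - R^2)(n - 1)/(n - p) is used with p = 2. The function name `adjusted_r2` is hypothetical.

```python
def adjusted_r2(ss_e, ss_t, n, p):
    """Adjusted R^2 = 1 - [SS_E/(n - p)] / [SS_T/(n - 1)]; p counts the intercept."""
    return 1.0 - (ss_e / (n - p)) / (ss_t / (n - 1))

n = 25  # number of observations, inferred from total degrees of freedom = 24

# Two-regressor model: p = 3 parameters (intercept, wire length, die height).
adj_two = adjusted_r2(ss_e=115.1735, ss_t=6105.9447, n=n, p=3)

# One-regressor model (Example 11-8): start from the quoted R^2 = 0.9640, p = 2.
adj_one = 1.0 - (1.0 - 0.9640) * (n - 1) / (n - 2)

print(round(adj_two, 3))  # 0.979
print(round(adj_one, 3))  # 0.962
```

Unlike R^2, the adjusted statistic rises here only because the error mean square falls when die height enters the model.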


Table 12-10 Test for Significance of Regression for Example 12-3

Source of            Sum of       Degrees of
Variation            Squares      Freedom      Mean Square    f_0       P-value
Regression           5990.7712     2           2995.3856      572.17    1.08E-19
Error or residual     115.1735    22              5.2352
Total                6105.9447    24
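The mean squares and the test statistic in Table 12-10 follow directly from the sums of squares and degrees of freedom. A minimal sketch of that arithmetic is given here; the variable names are illustrative, and the P-value is not recomputed because it requires an F-distribution routine outside the Python standard library.

```python
ss_r, df_r = 5990.7712, 2    # regression sum of squares and degrees of freedom
ss_e, df_e = 115.1735, 22    # error sum of squares and degrees of freedom

ms_r = ss_r / df_r           # regression mean square
ms_e = ss_e / df_e           # error (residual) mean square
f0 = ms_r / ms_e             # test statistic for significance of regression

print(round(ms_r, 4), round(ms_e, 4), round(f0, 2))  # 2995.3856 5.2352 572.17
```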

$$R^2_{\text{adj}} = 1 - \frac{SS_E/(n-p)}{SS_T/(n-1)} \qquad (12\text{-}22)$$
