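Now
$$SS_T = SS_R(\text{Full Model}) + SS_E(\text{Full Model})$$
and
$$SS_T = SS_R(\text{Reduced Model}) + SS_E(\text{Reduced Model})$$
so
$$\begin{aligned}
SS_R(\text{Extra}) &= SS_R(\text{Full Model}) - SS_R(\text{Reduced Model}) \\
&= [SS_T - SS_E(\text{Full Model})] - [SS_T - SS_E(\text{Reduced Model})] \\
&= SS_E(\text{Reduced Model}) - SS_E(\text{Full Model})
\end{aligned}$$
Therefore, this is an equivalent way to write the extra sum of squares.

12-6.5 Ridge Regression (CD Only)

Since multicollinearity primarily affects the stability of the regression coefficients, it would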
seem that estimating these parameters by some method that is less sensitive to multicollinear-
ity than ordinary least squares would be helpful. Several methods have been suggested. One
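alternative to ordinary least squares, ridge regression, can be useful in combating multicollinearity.
In ridge regression, the parameter estimates are obtained from
$$\hat{\boldsymbol{\beta}}^*(\theta) = (\mathbf{X}'\mathbf{X} + \theta\mathbf{I})^{-1}\mathbf{X}'\mathbf{y} \qquad \text{(S12-1)}$$
where $\theta \ge 0$ is a constant. Generally, values of $\theta$ in the interval $0 \le \theta \le 1$ are appropriate.
The ridge estimator $\hat{\boldsymbol{\beta}}^*(\theta)$ is not an unbiased estimator of $\boldsymbol{\beta}$, as is the ordinary least squares
estimator $\hat{\boldsymbol{\beta}}$. Thus, ridge regression seeks to find a set of regression coefficients that are more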
“stable,” in the sense that they have a small mean square error. Since multicollinearity usually
results in ordinary least squares estimators that may have extremely large variances, ridge
regression is suitable for situations where the multicollinearity problem exists.
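
As a minimal sketch of Equation S12-1, the following Python code computes the ridge estimator with NumPy. The data are synthetic and made up purely for illustration, the function names `ridge_estimate` and `unit_length_scale` are hypothetical, and the centering and unit-length scaling of the regressors is an assumption (a common practice that makes values of the constant between 0 and 1 meaningful), not something stated in the text.

```python
import numpy as np

def ridge_estimate(X, y, theta):
    """Return (X'X + theta*I)^{-1} X'y, the estimator in Equation S12-1."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    p = X.shape[1]
    # Solve the linear system instead of forming the inverse explicitly.
    return np.linalg.solve(X.T @ X + theta * np.eye(p), X.T @ y)

def unit_length_scale(X):
    """Center each column and scale it to unit length (assumed preprocessing)."""
    Xc = X - X.mean(axis=0)
    return Xc / np.sqrt((Xc ** 2).sum(axis=0))

# Synthetic, nearly collinear regressors (illustrative data only).
rng = np.random.default_rng(0)
x1 = rng.normal(size=30)
x2 = x1 + 0.01 * rng.normal(size=30)
X = unit_length_scale(np.column_stack([x1, x2]))
y = x1 + x2 + rng.normal(scale=0.5, size=30)
y = y - y.mean()  # centered response, so no intercept column is needed

beta_ols = ridge_estimate(X, y, 0.0)    # theta = 0 reduces to ordinary least squares
beta_ridge = ridge_estimate(X, y, 0.1)  # a small positive theta shrinks the estimates
print("OLS:  ", beta_ols)
print("ridge:", beta_ridge)
```

Setting the constant to zero recovers the ordinary least squares estimates, so the two printed vectors show directly how much the ridge constant pulls the coefficients toward zero when the regressors are nearly collinear.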
To obtain the ridge regression estimator from Equation S12-1, we must specify a value
for the constant $\theta$. Generally, there is an “optimum” $\theta$ for any problem, but the simplest approach
is to solve Equation S12-1 for several values of $\theta$ in the interval $0 \le \theta \le 1$. Then a
plot of the values of $\hat{\boldsymbol{\beta}}^*(\theta)$ against $\theta$ is constructed. This display is called a ridge trace. The approximate
value for $\theta$ is chosen subjectively by inspection of the ridge trace. Typically, its value is chosen
to obtain stable parameter estimates. Generally, the variance of $\hat{\boldsymbol{\beta}}^*(\theta)$ is a decreasing function
of $\theta$, while the squared bias $[\boldsymbol{\beta} - E(\hat{\boldsymbol{\beta}}^*(\theta))]^2$ is an increasing function of $\theta$. Choosing the
value of $\theta$ involves trading off these two properties of $\hat{\boldsymbol{\beta}}^*(\theta)$.
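
The ridge trace procedure described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a definitive implementation: the data are synthetic, the regressors are assumed to be centered and scaled to unit length, and the names `ridge_trace` and `total_variance_factor` are hypothetical. The variance factor uses the standard result that the covariance matrix of the ridge estimator is $\sigma^2 (\mathbf{X}'\mathbf{X} + \theta\mathbf{I})^{-1}\mathbf{X}'\mathbf{X}(\mathbf{X}'\mathbf{X} + \theta\mathbf{I})^{-1}$.

```python
import numpy as np

def unit_length_scale(X):
    """Center each column and scale it to unit length (assumed preprocessing)."""
    Xc = X - X.mean(axis=0)
    return Xc / np.sqrt((Xc ** 2).sum(axis=0))

def ridge_trace(X, y, thetas):
    """Solve Equation S12-1 for each theta; row i holds the estimates for thetas[i]."""
    p = X.shape[1]
    XtX, Xty = X.T @ X, X.T @ y
    return np.array([np.linalg.solve(XtX + t * np.eye(p), Xty) for t in thetas])

def total_variance_factor(X, theta):
    """Sum of the coefficient variances divided by sigma^2."""
    p = X.shape[1]
    A = np.linalg.inv(X.T @ X + theta * np.eye(p))
    return np.trace(A @ X.T @ X @ A)

# Synthetic, nearly collinear regressors (illustrative data only).
rng = np.random.default_rng(1)
x1 = rng.normal(size=40)
x2 = x1 + 0.05 * rng.normal(size=40)
X = unit_length_scale(np.column_stack([x1, x2]))
y = 2.0 * x1 - 1.0 * x2 + rng.normal(scale=0.5, size=40)
y = y - y.mean()

thetas = np.linspace(0.0, 1.0, 11)
trace = ridge_trace(X, y, thetas)
variance = np.array([total_variance_factor(X, t) for t in thetas])

# Each printed row is one point on the ridge trace, with its variance factor.
for t, b, v in zip(thetas, trace, variance):
    print(f"theta={t:4.2f}  coefficients={np.round(b, 3)}  variance factor={v:10.3f}")
```

Plotting each column of `trace` against `thetas` gives the ridge trace itself. The printed variance factor falls as the constant grows, which is the variance side of the trade-off; the price paid is the increasing bias.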
Extensive practical discussions of the use of ridge regression are in Montgomery,
Peck, and Vining (2001) and Myers (1990). In addition, several other biased estimation
techniques have been proposed for dealing with multicollinearity. Many regression com-
puter packages incorporate ridge regression capability. SAS PROC REG will fit ridge
regression models.
To illustrate ridge regression, consider the data in Table S12-1, which shows the heat gen-
erated in calories per gram for a particular type of cement as a function of the quantities of four
additives ($w_1$, $w_2$, $w_3$, and $w_4$). We wish to fit a multiple linear regression model to these data.
These are very “classical” regression data, first analyzed by Anders Hald. Refer to
Montgomery, Peck, and Vining (2001) for sources and more details.