Just as in the case of simple regression, the closeness of fit of the estimated
regression equation to the data is measured by the sum of squares of these
deviations:
$$\sum_{i=1}^{n}\left(Y_i - a - b_1 X_{1i} - b_2 X_{2i}\right)^2 \qquad (5.17)$$
where $n$ is the number of observations in the sample. The larger the sum of
squares, the less closely the estimated regression equation fits; the smaller
the sum of squares, the more closely it fits. Thus, it seems reasonable to
choose the values of $a$, $b_1$, and $b_2$ that minimize the expression in equation
(5.17). These estimates are least squares estimates, as in the case of simple
regression.
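To make this concrete, here is a minimal sketch in Python of how the least squares estimates might be obtained numerically; the data arrays are hypothetical, and NumPy's least squares routine is used to find the $a$, $b_1$, and $b_2$ that minimize the sum in equation (5.17).

```python
import numpy as np

# Hypothetical sample data: n = 6 observations of Y and two independent variables.
Y  = np.array([12.0, 15.5, 14.2, 18.3, 20.1, 19.4])
X1 = np.array([ 2.0,  3.0,  2.5,  4.0,  5.0,  4.5])
X2 = np.array([ 1.0,  1.5,  1.2,  2.0,  2.4,  2.2])

# Design matrix with a column of ones for the intercept a.
X = np.column_stack([np.ones_like(X1), X1, X2])

# Least squares estimates: the coefficients that minimize the sum of
# squared deviations in equation (5.17).
(a, b1, b2), *_ = np.linalg.lstsq(X, Y, rcond=None)

sum_of_squares = np.sum((Y - a - b1 * X1 - b2 * X2) ** 2)  # equation (5.17)
print(a, b1, b2, sum_of_squares)
```

Any routine that minimizes the same sum of squared deviations will yield the same estimates.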
Multiple Coefficient of Determination
In a previous section we described how the coefficient of determination can
be used to measure how well a simple regression equation fits the data.
When a multiple regression is calculated, the multiple coefficient of deter-
mination, rather than the simple coefficient of determination discussed pre-
viously, is used for this purpose. The multiple coefficient of determination
is defined as:
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2}{\sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^2} \qquad (5.18)$$
where $\hat{Y}_i$ is the value of the dependent variable that is predicted from the
regression equation. Thus $R^2$ measures the proportion of the
total variation in the dependent variable that is explained by the regression
equation. The positive square root of the multiple coefficient of determina-
tion is called the multiple correlation coefficient and is denoted by R. It,
too, is sometimes used to measure how well a multiple-regression equation
fits the data.
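A small sketch of equation (5.18) in Python, assuming NumPy arrays of observed values $Y_i$ and predicted values $\hat{Y}_i$; the helper name multiple_r_squared is illustrative only, not from the text.

```python
import numpy as np

def multiple_r_squared(Y, Y_hat):
    """Multiple coefficient of determination, equation (5.18):
    R^2 = 1 - sum((Y_i - Y_hat_i)^2) / sum((Y_i - mean(Y))^2)."""
    Y = np.asarray(Y, dtype=float)
    Y_hat = np.asarray(Y_hat, dtype=float)
    unexplained = np.sum((Y - Y_hat) ** 2)   # variation not explained by the regression
    total = np.sum((Y - Y.mean()) ** 2)      # total variation in the dependent variable
    return 1.0 - unexplained / total

# The multiple correlation coefficient R is the positive square root:
# R = np.sqrt(multiple_r_squared(Y, Y_hat))
```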
If there are only two independent variables in a multiple regression, as
in equation (5.16), a relatively simple way to compute the multiple coeffi-
cient of determination is:
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(Y_i - a - b_1 X_{1i} - b_2 X_{2i}\right)^2}{\sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^2}$$
since the predicted value is $\hat{Y}_i = a + b_1 X_{1i} + b_2 X_{2i}$, so that
$\sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n}(Y_i - a - b_1 X_{1i} - b_2 X_{2i})^2$.
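A sketch of this direct computation for the two-variable case, with the least squares estimates obtained as in the earlier sketch; the data values are hypothetical.

```python
import numpy as np

# Hypothetical data for a regression with two independent variables, as in equation (5.16).
Y  = np.array([12.0, 15.5, 14.2, 18.3, 20.1, 19.4])
X1 = np.array([ 2.0,  3.0,  2.5,  4.0,  5.0,  4.5])
X2 = np.array([ 1.0,  1.5,  1.2,  2.0,  2.4,  2.2])

# Least squares estimates of a, b1, and b2.
design = np.column_stack([np.ones_like(X1), X1, X2])
(a, b1, b2), *_ = np.linalg.lstsq(design, Y, rcond=None)

# Because Y_hat_i = a + b1*X1_i + b2*X2_i, the residual sum of squares can be
# written directly in terms of a, b1, and b2, giving R^2 without a separate
# step for the predicted values.
r_squared = 1.0 - np.sum((Y - a - b1 * X1 - b2 * X2) ** 2) / np.sum((Y - Y.mean()) ** 2)
R = np.sqrt(r_squared)   # multiple correlation coefficient
print(r_squared, R)
```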