Just as in the case of simple regression, the closeness of fit of the estimated
regression equation to the data is measured by the sum of squares of these
deviations:
$$\sum_{i=1}^{n}\,(Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n}\,(Y_i - a - b_1 X_{1i} - b_2 X_{2i})^2 \tag{5.17}$$

where $n$ is the number of observations in the sample. The larger the sum of
squares, the less closely the estimated regression equation fits; the smaller
the sum of squares, the more closely it fits. Thus, it seems reasonable to
choose the values of $a$, $b_1$, and $b_2$ that minimize the expression in equation
(5.17). These estimates are least squares estimates, as in the case of simple
regression.
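The least-squares computation described above can be sketched numerically. The data below are made up for illustration; the sketch solves the least-squares problem for two independent variables with numpy and prints the estimates $a$, $b_1$, and $b_2$.

```python
import numpy as np

# Illustrative data (not from the text): n = 6 observations of two
# independent variables X1, X2 and the dependent variable Y.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
Y = np.array([3.1, 4.2, 7.0, 7.9, 11.1, 11.8])

# Least squares: choose a, b1, b2 to minimize the sum of squared
# deviations sum((Y - a - b1*X1 - b2*X2)**2), the expression in (5.17).
X = np.column_stack([np.ones_like(X1), X1, X2])
a, b1, b2 = np.linalg.lstsq(X, Y, rcond=None)[0]
print(a, b1, b2)
```

At the minimum the residuals are orthogonal to the intercept column and to each independent variable, which is a convenient way to check the fit.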
Multiple Coefficient of Determination
In a previous section we described how the coefficient of determination can
be used to measure how well a simple regression equation fits the data.
When a multiple regression is calculated, the multiple coefficient of determination, rather than the simple coefficient of determination discussed previously, is used for this purpose. The multiple coefficient of determination
is defined as:
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\,(Y_i - \hat{Y}_i)^2}{\sum_{i=1}^{n}\,(Y_i - \bar{Y})^2} \tag{5.18}$$
where $\hat{Y}_i$ is the value of the dependent variable that is predicted from the regression equation. Thus $R^2$ measures the proportion of the total variation in the dependent variable that is explained by the regression equation. The positive square root of the multiple coefficient of determination is called the multiple correlation coefficient and is denoted by $R$. It,
too, is sometimes used to measure how well a multiple-regression equation
fits the data.
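As a numerical sketch of these two measures, the hypothetical data below are fitted by least squares, $R^2$ is computed as one minus the unexplained variation over the total variation, and $R$ is taken as its positive square root. For a least-squares fit that includes an intercept, $R$ also equals the simple correlation between the observed and predicted values of the dependent variable.

```python
import numpy as np

# Hypothetical data; Y_hat holds the values predicted by a least-squares
# fit with two independent variables.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
Y = np.array([3.0, 4.5, 6.8, 8.1, 10.9, 12.2])

X = np.column_stack([np.ones_like(X1), X1, X2])
Y_hat = X @ np.linalg.lstsq(X, Y, rcond=None)[0]

# R^2 = 1 - (unexplained variation) / (total variation)
R2 = 1 - np.sum((Y - Y_hat) ** 2) / np.sum((Y - Y.mean()) ** 2)
R = np.sqrt(R2)  # multiple correlation coefficient
print(R2, R)
```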
If there are only two independent variables in a multiple regression, as
in equation (5.16), a relatively simple way to compute the multiple coefficient of determination is:
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\,(Y_i - a - b_1 X_{1i} - b_2 X_{2i})^2}{\sum_{i=1}^{n}\,(Y_i - \bar{Y})^2}$$

since $\sum_{i=1}^{n}\,(Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n}\,(Y_i - a - b_1 X_{1i} - b_2 X_{2i})^2$.
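This direct computation from the estimates $a$, $b_1$, and $b_2$ can be sketched as follows; the sample data are made up for illustration.

```python
import numpy as np

# Made-up sample with two independent variables.
X1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
Y = np.array([3.0, 4.5, 6.8, 8.1, 10.9, 12.2])

# Obtain the least-squares estimates a, b1, b2.
X = np.column_stack([np.ones_like(X1), X1, X2])
a, b1, b2 = np.linalg.lstsq(X, Y, rcond=None)[0]

# R^2 computed directly from a, b1, and b2, without forming Y_hat first:
unexplained = np.sum((Y - a - b1 * X1 - b2 * X2) ** 2)
total = np.sum((Y - Y.mean()) ** 2)
R2 = 1 - unexplained / total
print(R2)
```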