It is for this reason that a redefined version of the coefficient of determination is needed and is called the adjusted R-squared (or adjusted $R^2$), given by
$$R_{\text{adj}}^2 = 1 - (1 - R^2)\,\frac{n-1}{n-k-1} \qquad (3.14)$$
This adjusted goodness-of-fit measure incorporates the number of observations, n, as well as the number of independent variables, k, plus the constant term, in the denominator (n − k − 1). As long as the number of observations is very large compared to k, $R^2$ and $R_{\text{adj}}^2$ are approximately the same.$^7$ However, as the number k of independent variables included increases, $R_{\text{adj}}^2$ drops noticeably compared to the original $R^2$. One can interpret this new measure of fit as penalizing the excessive use of independent variables. Instead, one should set up the model as parsimoniously as possible. To take most advantage of the set of possible independent variables, one should include those that contribute the most explanatory variation to the regression. That is, one has to weigh the benefit of each additional independent variable against the reduction it causes in the adjusted $R^2$.
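To make this penalty concrete, here is a minimal Python sketch of equation (3.14); the values n = 30 and $R^2 = 0.60$ are hypothetical, chosen only to illustrate how the adjustment behaves as k grows.

```python
def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Equation (3.14): adjust R^2 for k regressors and n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

# Hypothetical values: R^2 = 0.60 from a fit on n = 30 observations.
n, r2 = 30, 0.60
for k in (1, 3, 5, 10):
    print(f"k = {k:2d}: adjusted R^2 = {adjusted_r_squared(r2, n, k):.4f}")
```

Holding $R^2$ fixed, each additional regressor shrinks the adjusted measure, which is exactly the parsimony penalty described above.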
Testing for the Significance of the Independent Variables
Suppose we have found that the model is significant. Now, we turn to the
test of significance for individual independent variables. Formally, for each
of the k independent variables, we test
$$H_0: \beta_j = 0 \qquad H_1: \beta_j \neq 0$$
conditional on the other independent variables already included in the
regression model.
The appropriate test would be the t-test, given by
$$t = \frac{b_j - 0}{s_{b_j}} \qquad (3.15)$$
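As a rough illustration of how equation (3.15) is computed, the following Python sketch fits an OLS regression and forms $t = b_j / s_{b_j}$ for each coefficient, each test conditional on the other regressors being in the model. The data are synthetic and the helper name coefficient_t_tests is an assumption for this example, not something from the text.

```python
import numpy as np
from scipy import stats

def coefficient_t_tests(X: np.ndarray, y: np.ndarray):
    """OLS fit, then t = (b_j - 0) / s_{b_j}, as in equation (3.15)."""
    n, k = X.shape
    Xc = np.column_stack([np.ones(n), X])        # prepend the constant term
    b, *_ = np.linalg.lstsq(Xc, y, rcond=None)   # OLS coefficient estimates
    resid = y - Xc @ b
    s2 = resid @ resid / (n - k - 1)             # residual variance estimate
    se = np.sqrt(s2 * np.diag(np.linalg.inv(Xc.T @ Xc)))  # standard errors s_{b_j}
    t = b / se                                   # test H0: beta_j = 0
    p = 2 * stats.t.sf(np.abs(t), df=n - k - 1)  # two-sided p-values
    return b, se, t, p

# Synthetic data (hypothetical): y depends on the first regressor only.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 1.0 + 2.0 * X[:, 0] + rng.normal(size=100)
b, se, t, p = coefficient_t_tests(X, y)
print("t-statistics:", np.round(t, 2))
print("p-values:   ", np.round(p, 4))
```

The second slope is pure noise, so its t-statistic should be small in absolute value, while the intercept and the first slope should be clearly significant; in practice these statistics are read directly off standard regression output.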
$^7$ For instance, inserting k = 1 into equation (3.14) we obtain
$$R_{\text{adj}}^2 = 1 - (1 - R^2)\,\frac{n-1}{n-2} = R^2 - \frac{1 - R^2}{n-2}$$
which, for large n, is only slightly less than $R^2$.