Basic Statistics

LINEAR REGRESSION: SINGLE SAMPLE

Here we used a t value from Table A.3 that corresponds to the area up to .975
in order to obtain 95% two-sided intervals. The d.f. are n - 2 = 10 - 2 = 8.
We use n - 2 here because it was necessary to estimate two population
parameters of the regression line, $\alpha$ and $\beta$. If we repeatedly took samples
of size 10 and calculated the 95% confidence intervals, 95% of the intervals would
include $\beta$, the population slope coefficient.
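The repeated-sampling interpretation can be checked with a small simulation. The following is a minimal sketch, not a reproduction of the book's data: the population values `alpha`, `beta`, `sigma`, the range of X values, and the function names are all hypothetical choices made for illustration. It assumes the usual slope standard error, se(b) = s_{y.x} / [sum of (X - X̄)²]^{1/2}, and uses the tabled t value 2.306 for 8 d.f.

```python
import math
import random

T_975_8DF = 2.306  # t[.975] with 8 d.f., the tabled value used in the text


def slope_ci(xs, ys, t_crit=T_975_8DF):
    """Return (lower, upper) confidence limits for the population slope."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx               # sample slope
    a = y_bar - b * x_bar       # sample intercept
    # Residual variance has n - 2 d.f. because a and b are both estimated.
    s2_yx = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / (n - 2)
    se_b = math.sqrt(s2_yx / sxx)
    return b - t_crit * se_b, b + t_crit * se_b


def coverage(reps=2000, n=10, alpha=80.0, beta=0.5, sigma=5.0, seed=1):
    """Fraction of simulated 95% intervals that contain the true slope.

    alpha, beta, sigma and the X range are hypothetical population values.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        xs = [rng.uniform(140.0, 240.0) for _ in range(n)]
        ys = [alpha + beta * x + rng.gauss(0.0, sigma) for x in xs]
        lower, upper = slope_ci(xs, ys)
        if lower < beta < upper:
            hits += 1
    return hits / reps


print(coverage())
```

With normally distributed errors, the printed fraction should be close to 0.95, differing only by simulation noise.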
Occasionally, we need to compute confidence intervals for the population intercept
$\alpha$. If the sample includes points that have X values close to 0 so that the regression
line does not extend far below the smallest X value, then these confidence intervals
can be interpreted safely. The standard error of a is given by

$$\mathrm{se}(a) = s_{y \cdot x}\left[\frac{1}{n} + \frac{\bar{X}^2}{\sum (X - \bar{X})^2}\right]^{1/2}$$

and the 95% confidence interval for the population intercept is

$$a \pm t[.975]\,[\mathrm{se}(a)]$$

where a t value with n - 2 d.f. is used. For the example with the 10 males, we have


$$\mathrm{se}(a) = 5.24\left[\frac{1}{10} + \frac{(190.7)^2}{7224.1}\right]^{1/2} = 5.24(2.266) = 11.87$$


and the 95% confidence interval is computed from

$$80.74 \pm 2.306(11.87) = 80.74 \pm 27.37$$


or
$$53.37 < \alpha < 108.11$$
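The intercept calculation can be retraced in a few lines of Python. This is a sketch using only the numbers quoted in the worked example; it reads 190.7 as the mean of X and 7224.1 as the sum of squared deviations of X about its mean, which is how they enter the worked formula. The function and variable names are illustrative.

```python
import math

# Values quoted in the worked example for the 10 males.
n = 10
a = 80.74        # sample intercept
s_yx = 5.24      # standard deviation of the residuals about the line
x_bar = 190.7    # assumed to be the mean of X
sxx = 7224.1     # assumed to be the sum of (X - mean of X)^2
t_crit = 2.306   # t[.975] with n - 2 = 8 d.f.


def intercept_ci(a, s_yx, n, x_bar, sxx, t_crit):
    """Return se(a) and the lower and upper confidence limits for the intercept."""
    se_a = s_yx * math.sqrt(1.0 / n + x_bar ** 2 / sxx)
    half_width = t_crit * se_a
    return se_a, a - half_width, a + half_width


se_a, lower, upper = intercept_ci(a, s_yx, n, x_bar, sxx, t_crit)
print(f"se(a) = {se_a:.2f}")
print(f"{lower:.2f} < alpha < {upper:.2f}")
```

Because the code carries se(a) at full precision rather than rounding it to 11.87 before multiplying, the limits may differ from the hand calculation by about 0.01.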

Note that for this example, there is no reason to interpret the confidence limits for
the population intercept; no weights are near 0. In general, caution should be used in
making inferences from the regression line below the smallest X value or above the
largest X value.
The confidence interval for the variance about the regression line follows the
procedure used in Chapter 9 for a single sample except that n - 1 is replaced by
n - 2. The values for chi-squared are taken from Table A.4 for 8 d.f. The formula is
given by

$$\frac{(n-2)s_{y \cdot x}^2}{\chi^2[.975]} < \sigma_{y \cdot x}^2 < \frac{(n-2)s_{y \cdot x}^2}{\chi^2[.025]}$$

and entering the data for the 10 males, we have

$$\frac{8(27.49)}{17.53} < \sigma_{y \cdot x}^2 < \frac{8(27.49)}{2.18}$$

or $12.55 < \sigma_{y \cdot x}^2 < 100.88$. Taking the square root, we have
$3.54 < \sigma_{y \cdot x} < 10.04$ for the 95% confidence interval of the standard
deviation of the residuals about the regression line.
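The same arithmetic can be sketched in Python. The chi-squared values 17.53 and 2.18 are the tabled values for 8 d.f. quoted in the text, and 27.49 is the residual variance for the 10 males. The function name is illustrative.

```python
import math

n = 10
s2_yx = 27.49      # residual variance about the regression line
chi2_975 = 17.53   # chi-squared[.975] with 8 d.f. (tabled value)
chi2_025 = 2.18    # chi-squared[.025] with 8 d.f. (tabled value)


def variance_ci(s2, df, chi2_upper, chi2_lower):
    """Return (lower, upper) confidence limits for the residual variance."""
    # Dividing by the larger chi-squared value gives the lower limit.
    return df * s2 / chi2_upper, df * s2 / chi2_lower


var_lo, var_hi = variance_ci(s2_yx, n - 2, chi2_975, chi2_025)
sd_lo, sd_hi = math.sqrt(var_lo), math.sqrt(var_hi)

print(f"{var_lo:.2f} < variance < {var_hi:.2f}")
print(f"{sd_lo:.2f} < standard deviation < {sd_hi:.2f}")
```

Note that the limits for the standard deviation are obtained by taking square roots of the variance limits, not by a separate table lookup.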
