Chapter 8 Regression and Correlation 315
Suppose that x is years on the job and y is salary. Then the y intercept
(x = 0) is the salary for a person with zero years’ experience, the starting
salary. The slope is the change in salary per year of service. A person with
a salary above the line would have a positive residual, and a person with a
salary below the line would have a negative residual.
If the line trends downward so that y decreases when x increases, then
the slope is negative. For example, if x is age and y is price for used cars,
then the slope gives the drop in price per year of age. In this example, the
intercept is the price when new, and the residuals represent the difference
between the actual price and the predicted price. All other things being
equal, if the straight line is the correct model, a positive residual means a
car costs more than it should, and a negative residual means a car costs less
than it should (that is, it’s a bargain).
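To make the sign convention concrete, here is a minimal sketch of the used-car example. The intercept, slope, and prices are invented for illustration and do not come from any data set in this chapter; the function names are likewise hypothetical.

```python
# Hypothetical fitted line for used-car prices (all numbers invented for illustration).
INTERCEPT = 20000.0  # predicted price at age 0, that is, the price when new
SLOPE = -1500.0      # change in predicted price per year of age (negative: price drops)


def predicted_price(age_years):
    """Price predicted by the line for a car of the given age."""
    return INTERCEPT + SLOPE * age_years


def residual(actual_price, age_years):
    """Actual price minus predicted price."""
    return actual_price - predicted_price(age_years)


def describe(actual_price, age_years):
    """Say where the car sits relative to the line, based on the residual's sign."""
    r = residual(actual_price, age_years)
    if r > 0:
        return "above the line (costs more than predicted)"
    if r < 0:
        return "below the line (a bargain)"
    return "on the line"


# Two hypothetical four-year-old cars with different asking prices.
for price in (15000.0, 13000.0):
    print(price, residual(price, 4), describe(price, 4))
```

With these made-up numbers, the line predicts 14,000 for a four-year-old car, so the car priced at 15,000 has a positive residual and the car priced at 13,000 has a negative one.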
Fitting the Regression Line
When fitting a line to data, you assume that the data follow the linear model:
$y = \alpha + \beta x + \varepsilon$
where $\alpha$ is the “true” intercept, $\beta$ is the “true” slope, and $\varepsilon$ is an error term.
When you fit the line, you’ll try to estimate $\alpha$ and $\beta$, but you can never know
them exactly. The estimates of $\alpha$ and $\beta$, we’ll label a and b. The predicted
values of y using these estimates, we’ll label $\hat{y}$, so that
$\hat{y} = a + bx$
To get estimates for $\alpha$ and $\beta$, we use values of a and b that result in a
minimum value for the sum of squared residuals. In other words, if $y_i$ is an
observed value of y, we want values of a and b such that
Sum of squared residuals $= \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$
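The following sketch shows one way to find the a and b that minimize this sum. It uses the standard closed-form least-squares solution (slope equal to the sum of cross-deviations divided by the sum of squared x deviations, intercept equal to the mean of y minus the slope times the mean of x), which is an assumption brought in here rather than something stated in the passage above. The five data points and the function names are invented for illustration.

```python
def fit_line(xs, ys):
    """Return (a, b), the intercept and slope that minimize the sum of squared residuals."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sum of cross-deviations and sum of squared x deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx        # estimated slope
    a = y_bar - b * x_bar  # estimated intercept
    return a, b


def sum_squared_residuals(a, b, xs, ys):
    """Sum over all observations of (observed y - predicted y) squared."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical data, invented for illustration.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
best = sum_squared_residuals(a, b, xs, ys)
print(a, b, best)

# Moving either estimate away from the fitted value makes the sum larger.
print(sum_squared_residuals(a + 0.1, b, xs, ys) > best)
print(sum_squared_residuals(a, b - 0.1, xs, ys) > best)
```

For these invented points the estimates come out to roughly a = 2.2 and b = 0.6, with a sum of squared residuals of about 2.4. The last two lines check that nudging either estimate increases the sum, which is what it means for a and b to be the least-squares values.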
Figure 8-1 A fitted regression line (with a positive residual and a negative residual labeled)