Basic Statistics

170 REGRESSION AND CORRELATION


12.2.2 Interpreting the Regression Coefficients

The quantity b = .2903 is called the slope of the straight line. In the case of the
regression line, b is called the regression coefficient. This number is the change in Y
for a unit change in X. If X is increased by 1 lb, Y is increased by .2903 mmHg.
If X is increased by 20 lb, Y is increased by .2903(20) = 5.8 mmHg. Note that the
slope is positive, so a heavier weight tends to be associated with a higher systolic
blood pressure. If the slope coefficient were negative, increasing values of X would
tend to result in decreasing values of Y. If the slope were 0, the regression line would
be horizontal.
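
The arithmetic above can be reproduced in a few lines. This is only a sketch: the function name predicted_change is made up for illustration, and the slope is the value b = .2903 quoted in the text.

```python
# Slope quoted in the text: mmHg of systolic pressure per lb of weight.
B = 0.2903

def predicted_change(delta_x, slope=B):
    """Change in Y along the regression line for a change of delta_x in X."""
    return slope * delta_x

print(round(predicted_change(1), 4))   # 0.2903 mmHg for 1 lb
print(round(predicted_change(20), 1))  # 5.8 mmHg for 20 lb
```

A negative slope would make every such change negative for positive delta_x, and a slope of 0 would make it 0 regardless of delta_x.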
One difficulty in interpreting the value of the slope coefficient b is that it changes
if we change the units of X. For example, if X were measured in kilograms instead
of pounds, we would get a larger value for the slope (about .64 mmHg per kilogram,
since 1 kg is about 2.2 lb). Thus it is not obvious how
to evaluate the magnitude of the slope coefficient. One way of evaluating the slope
coefficient is to multiply b by X̄ and to contrast this result with Ȳ. If bX̄ is small
relative to Ȳ, the magnitude of the effect of b in predicting Y will tend to be small.
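
The dependence on units can be checked numerically. The sketch below assumes the conversion 1 kg = 2.20462 lb, which is not given in the text, and uses the slope and the two means quoted in this chapter (190.7 lb and 136.1 mmHg). The variable names are illustrative only.

```python
LB_PER_KG = 2.20462  # assumed conversion factor, not from the text

b_per_lb = 0.2903    # slope from the text, mmHg per lb
x_bar_lb = 190.7     # mean weight from the text, lb
y_bar = 136.1        # mean systolic pressure from the text, mmHg

# Measuring X in kilograms divides every X value by LB_PER_KG,
# so the slope is multiplied by the same factor.
b_per_kg = b_per_lb * LB_PER_KG
x_bar_kg = x_bar_lb / LB_PER_KG

# The product of slope and mean X does not depend on the units of X.
contribution_lb = b_per_lb * x_bar_lb
contribution_kg = b_per_kg * x_bar_kg

# Size of that product relative to the mean of Y.
share = contribution_lb / y_bar

print(round(b_per_kg, 2))         # about 0.64 mmHg per kg
print(round(contribution_lb, 1))  # about 55.4 mmHg
print(round(share, 2))            # about 0.41 of the mean pressure
```

Because the product of the slope and the mean of X comes out the same in pounds and in kilograms, comparing it with the mean of Y avoids the unit problem that affects b itself.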
The quantity a = 80.74 is called the intercept. It represents the value of Y when
X = 0. The magnitude of a is often difficult to interpret since in many regression
lines we do not have any values of X close to 0, and it is hard to know if the points
fit a straight line outside the range of the actual X values. Since no adult male could
have a weight of 0, the value of the intercept is not very useful in our example.


12.2.3 Plotting the Regression Line

The regression line may be plotted as any straight line is plotted, by calculating
the value of Y for several values of X. The values of X should be chosen spaced
sufficiently far apart so that small inaccuracies in plotting will not influence the
placement of the line too much. For example, for the regression line,


Y = 80.74 + .2903X


we can substitute X = 140, X = 190.7 (the mean), and X = 260 in the equation
for the regression line and obtain Y = 121.4, Y = 136.1, and Y = 156.2. The 10
original points as well as the regression line are shown in Figure 12.2. Note that the
regression line has only been drawn to include the range of the X values. We do not
know how a straight line might fit the data outside the range of the X values that we
have measured and plotted.
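
The three plotting points can be computed directly from the equation. This is a minimal sketch using the intercept and slope quoted in the text; the function name regression_line is an illustrative choice.

```python
def regression_line(x, a=80.74, b=0.2903):
    """Height of the regression line at weight x (lb), in mmHg."""
    return a + b * x

# Widely spaced X values that stay inside the observed range of weights.
xs = [140, 190.7, 260]
points = [(x, round(regression_line(x), 1)) for x in xs]

print(points)  # [(140, 121.4), (190.7, 136.1), (260, 156.2)]
```

Joining the first and last points with a straight edge gives the line; the middle point serves as a check that no arithmetic slip was made.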
Note also that at X̄ = 190.7, the height of the line is Y = 136.1 = Ȳ. This
is always the case. The least-squares regression line always goes through the point
(X̄, Ȳ).
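
This property can be checked on any data set. The sketch below uses the usual least-squares formulas for the slope and intercept, which are not restated in this section, and a small set of hypothetical weights and pressures; these are NOT the 10 points from the example.

```python
from statistics import mean

def least_squares(xs, ys):
    """Return (intercept a, slope b) of the least-squares line for ys on xs."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    a = y_bar - b * x_bar  # this choice of intercept is what forces the property
    return a, b

# Hypothetical data for illustration only (lb, mmHg).
weights = [150.0, 165.0, 180.0, 195.0, 210.0, 240.0]
pressures = [120.0, 131.0, 128.0, 140.0, 137.0, 152.0]

a, b = least_squares(weights, pressures)
height_at_mean = a + b * mean(weights)

# The line's height at the mean weight equals the mean pressure,
# up to floating-point rounding.
print(abs(height_at_mean - mean(pressures)) < 1e-9)  # True
```

The check succeeds for any data with at least two distinct X values, because the intercept is computed as the mean of Y minus b times the mean of X.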


12.2.4 The Meaning of the Least-Squares Line

If for each value of X, we calculate Ŷ, the point on the regression line for that X
value, it can be subtracted from the observed Y value to obtain Y - Ŷ. The difference
Y - Ŷ is called a residual. The residuals are the vertical distances of data points