computer printout above. The first is the “St Dev of x” (0.2772 in this example). This is the standard
error of the slope of the regression line, $s_b$, the estimate of the standard deviation of the slope (for
information, although you don’t need to know this, $s_b = s / \sqrt{\sum (x_i - \bar{x})^2}$). The second
standard error given is the standard error of the residuals, the “s” (s = 16.57) at the lower left corner
of the table. This is the estimate of the standard deviation of the residuals (again, although you don’t
need to know this, $s = \sqrt{\sum (y_i - \hat{y}_i)^2 / (n - 2)}$).
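To see where these two standard errors come from, here is a minimal sketch that computes them by hand using the usual least-squares formulas, s = sqrt(SSE / (n - 2)) and s_b = s / sqrt(sum of (x - x-bar)^2). The function name `regression_standard_errors` and the five data points are made up for illustration; they are not the data behind the printout, so the results will not match 0.2772 or 16.57.

```python
import math

def regression_standard_errors(x, y):
    """Return (intercept a, slope b, s, s_b) for the least-squares line y-hat = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of squared deviations of x, and sum of cross-products of deviations
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                # slope
    a = y_bar - b * x_bar        # intercept
    # Sum of squared residuals (observed y minus predicted y)
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))  # standard error of the residuals
    s_b = s / math.sqrt(sxx)      # standard error of the slope
    return a, b, s, s_b

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b, s, s_b = regression_standard_errors(x, y)
print(f"slope = {b:.4f}, s = {s:.4f}, s_b = {s_b:.4f}")
```

Note that s_b shrinks as the x-values become more spread out, since the denominator grows with the sum of squared deviations of x.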
Outliers and Influential Observations
Some observations have a disproportionate impact on correlation and regression. We defined an outlier
when we were dealing with one-variable data (remember the 1.5 [IQR] rule?). There is no analogous formal
definition when dealing with two-variable data, but the basic idea is the same: an outlier lies outside of the general pattern
of the data. An outlier can certainly influence a correlation and, depending on where it is located, may
also exert an influence on the slope of the regression line.
An influential observation is often an outlier in the x-direction. If it doesn’t line up with the rest
of the data, it changes the slope of the regression line. More generally, an influential observation
is a datapoint that exerts a strong influence on a measure, such as the correlation coefficient or the slope.
example: Graphs I, II, and III are the same except for the point symbolized by the box in graphs II
and III. Graph I has no outliers or influential points. Graph II has an outlier that is
influential with respect to the correlation. Graph III has an outlier that is
influential with respect to the regression slope. Compare the correlation coefficients
and regression lines for each graph. Note that the outlier in Graph II has some effect on the
slope and a large effect on the correlation coefficient. The influential point in Graph III
has about the same effect on the correlation coefficient as the outlier in Graph II, but a major
influence on the slope of the regression line.
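The same comparison can be sketched numerically. The data below are hypothetical (they are not the points plotted in Graphs I, II, and III), and the helper names `slope` and `correlation` are chosen for this illustration. A roughly linear base set is refit twice: once with an added point near the middle of the x-values but far above the pattern, and once with an added point far out in the x-direction that does not line up with the rest.

```python
import math

def slope(x, y):
    """Least-squares slope of y on x."""
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

def correlation(x, y):
    """Pearson correlation coefficient r."""
    x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical base data: nearly linear, no outliers
x_base = [1, 2, 3, 4, 5, 6, 7, 8, 9]
y_base = [1.2, 1.9, 3.1, 4.0, 5.2, 5.9, 7.1, 8.0, 9.1]

# Added point near the middle of the x-values, far above the pattern
x_out, y_out = x_base + [4], y_base + [14]

# Added point far out in the x-direction, off the pattern
x_inf, y_inf = x_base + [20], y_base + [5]

r_base, b_base = correlation(x_base, y_base), slope(x_base, y_base)
r_out, b_out = correlation(x_out, y_out), slope(x_out, y_out)
r_inf, b_inf = correlation(x_inf, y_inf), slope(x_inf, y_inf)

for label, r, b in [("base", r_base, b_base),
                    ("outlier in y", r_out, b_out),
                    ("outlier in x", r_inf, b_inf)]:
    print(f"{label:13s} r = {r:.3f}  slope = {b:.3f}")
```

With these made-up values, both added points lower the correlation noticeably, but only the point that is extreme in the x-direction changes the slope by a large amount.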
Transformations to Achieve Linearity
Until now, we have been concerned with data that can be modeled with a line. Of course, there are many
two-variable relationships that are nonlinear. The path of an object thrown in the air is parabolic