AP Statistics 2017


computer printout above. The first is the “St Dev of x” (0.2772 in this example). This is the standard
error of the slope of the regression line, $s_b$, the estimate of the standard deviation of the slope (for
information, although you don’t need to know this, $s_b = \dfrac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$). The second standard error given is
the standard error of the residuals, the “s” (s = 16.57) at the lower left corner of the table. This is the
estimate of the standard deviation of the residuals (again, although you don’t need to know this,
$s = \sqrt{\dfrac{\sum (y_i - \hat{y}_i)^2}{n - 2}}$).
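To see how a printout could arrive at these two standard errors, here is a minimal sketch. It assumes the usual least-squares formulas: s is the square root of the sum of squared residuals divided by n − 2, and the standard error of the slope is s divided by the square root of Σ(x − x̄)². The function name `regression_standard_errors` and the data are hypothetical; they are not the data behind the printout discussed above.

```python
import math

def regression_standard_errors(x, y):
    """Return (slope, intercept, s, s_b) for the least-squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross-products about the means
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Sum of squared residuals (observed y minus predicted y)
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))   # standard error of the residuals
    s_b = s / math.sqrt(sxx)       # standard error of the slope
    return slope, intercept, s, s_b

# Hypothetical data, chosen only to keep the arithmetic easy to check by hand
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept, s, s_b = regression_standard_errors(x, y)
print(f"slope = {slope:.4f}, s = {s:.4f}, s_b = {s_b:.4f}")
```

Note that s appears in the numerator of the slope's standard error, so noisier residuals make the slope estimate less precise, while a wider spread of x-values makes it more precise.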


Outliers and Influential Observations


Some observations have a disproportionate impact on correlation and regression. We defined an outlier when we were
dealing with one-variable data (remember the 1.5(IQR) rule?). There is no analogous definition when
dealing with two-variable data, but the basic idea is the same: an outlier lies outside the general pattern
of the data. An outlier can certainly influence a correlation and, depending on where it is located, may
also exert an influence on the slope of the regression line.
An influential observation is often an outlier in the x-direction. If it doesn’t line up with the rest of
the data, its influence is on the slope of the regression line. More generally, an influential observation
is a data point that exerts a strong influence on a measure such as the correlation or the regression slope.


example: Graphs I, II, and III are the same except for the point symbolized by the box in graphs II
and III. Graph I below has no outliers or influential points. Graph II has an outlier that is an
influential point that has an effect on the correlation. Graph III has an outlier that is an
influential point that has an effect on the regression slope. Compare the correlation coefficients
and regression lines for each graph. Note that the outlier in Graph II has some effect on the
slope and a significant effect on the correlation coefficient. The influential point in Graph III
has about the same effect on the correlation coefficient as the outlier in Graph II, but a major
influence on the slope of the regression line.
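The same kind of comparison can be tried numerically. The sketch below uses made-up data (not the data behind Graphs I, II, and III) and a hypothetical helper, `slope_and_r`, to show how one added point changes the correlation and the slope depending on where it sits: point A lies at the mean of x but far from the pattern in y, and point B lies far out in the x-direction and off the pattern.

```python
import math

def slope_and_r(points):
    """Return (least-squares slope, correlation r) for a list of (x, y) pairs."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    sxx = sum((x - x_bar) ** 2 for x, _ in points)
    syy = sum((y - y_bar) ** 2 for _, y in points)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    return sxy / sxx, sxy / math.sqrt(sxx * syy)

# Hypothetical base data with a moderate positive linear pattern
base = [(1, 2), (2, 4), (3, 5), (4, 4), (5, 5)]

# Point A: x equals the mean of x, but y is far above the pattern
with_a = base + [(3, 10)]

# Point B: far from the other x-values and below the pattern
with_b = base + [(15, 3)]

slope_base, r_base = slope_and_r(base)
slope_a, r_a = slope_and_r(with_a)
slope_b, r_b = slope_and_r(with_b)

for label, slope, r in [("base", slope_base, r_base),
                        ("with A", slope_a, r_a),
                        ("with B", slope_b, r_b)]:
    print(f"{label:7s} slope = {slope:7.4f}   r = {r:7.4f}")
```

With these particular numbers, point A leaves the slope at 0.6 but cuts r from about 0.77 to about 0.32, while point B pulls the slope down to roughly −0.03. Other data sets will behave differently in detail, but the pattern matches the description above: a point that is extreme in the x-direction has the most leverage on the slope.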

Transformations to Achieve Linearity


Until now, we have been concerned with data that can be modeled with a line. Of course, there are many
two-variable relationships that are nonlinear. The path of an object thrown in the air is parabolic
