Data Analysis with Microsoft Excel: Updated for Office 2007

Chapter 4 Describing Your Data

differences involved. In any case, you should not remove an observation
without good cause and documentation of what you did and why.
What constitutes an outlier? How large (or small) must a value be before
it can be considered an outlier? One accepted definition depends on
the interquartile range (IQR; recall that the interquartile range is equal to
the difference between the third and first quartiles).


  1. If a value is greater than the third quartile plus 1.5 × IQR or less than
    the first quartile minus 1.5 × IQR, it’s a moderate outlier.

  2. If a value is greater than the third quartile plus 3 × IQR or less than the
    first quartile minus 3 × IQR, it’s an extreme outlier.

A diagram displaying the boundaries for moderate and extreme outliers
is shown in Figure 4-24.
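
The two rules above can be sketched in a few lines of Python. This is a minimal illustration, not part of the Excel workflow in this chapter; the function name classify_outlier and the sample quartiles (10 and 20) are hypothetical. It also assumes that a value beyond the 3 × IQR boundary is reported only as extreme, although it satisfies the moderate rule as well.

```python
def classify_outlier(value, q1, q3):
    """Label a value with the 1.5 x IQR and 3 x IQR rules.

    Assumption: a value past the 3 x IQR boundary is reported
    as "extreme" only, even though it also passes the 1.5 x IQR test.
    """
    iqr = q3 - q1  # interquartile range

    # Check the wider (extreme) boundaries first.
    if value > q3 + 3 * iqr or value < q1 - 3 * iqr:
        return "extreme"
    # Then the narrower (moderate) boundaries.
    if value > q3 + 1.5 * iqr or value < q1 - 1.5 * iqr:
        return "moderate"
    return "none"


# Hypothetical quartiles: Q1 = 10, Q3 = 20, so IQR = 10.
for v in (25, 40, 60, -10, -25):
    print(v, classify_outlier(v, 10, 20))
```

Checking the extreme boundaries first keeps the two labels mutually exclusive, which matches how the two kinds of outlier are usually reported separately.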


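Figure 4-24
The range of moderate and extreme outliers
[Diagram: a number line marked, from left to right, at Q1 − (3 × IQR),
Q1 − (1.5 × IQR), Q1, median, Q3, Q3 + (1.5 × IQR), and Q3 + (3 × IQR).
The interquartile range (IQR) spans Q1 to Q3. Moderate outliers lie between
the 1.5 × IQR and 3 × IQR marks on each side; extreme outliers lie beyond
the 3 × IQR marks.]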

For example, if the first quartile equals 30 and the third quartile equals
80, the interquartile range is 50. Any value above 80 + (1.5 × 50), or 155,
would be considered a moderate outlier. Any value above 80 + 150, or 230,
would be considered an extreme outlier. The lower ranges for outliers would
be calculated similarly.
This definition of the outlier plays an important role in constructing one
of the most useful tools of descriptive statistics: the boxplot.
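
As a check on this arithmetic, including the lower boundaries that the text leaves to the reader, here is a short Python sketch. The function name outlier_fences and the dictionary keys are hypothetical, chosen only for illustration; the quartiles 30 and 80 are those of the example.

```python
def outlier_fences(q1, q3):
    """Return the four outlier boundaries implied by Q1 and Q3."""
    iqr = q3 - q1  # interquartile range
    return {
        "lower_extreme": q1 - 3 * iqr,     # below this: extreme outlier
        "lower_moderate": q1 - 1.5 * iqr,  # below this: moderate outlier
        "upper_moderate": q3 + 1.5 * iqr,  # above this: moderate outlier
        "upper_extreme": q3 + 3 * iqr,     # above this: extreme outlier
    }


# Quartiles from the example: Q1 = 30, Q3 = 80, so IQR = 50.
fences = outlier_fences(30, 80)
for name, bound in fences.items():
    print(name, bound)
```

With these quartiles the upper boundaries come out to 155 and 230, as in the text, and the lower boundaries to 30 − 75 = −45 and 30 − 150 = −120.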

Working with Boxplots


In this section, we’ll explore one of the more important tools of descriptive
statistics, the boxplot. You’ll learn about boxplots interactively with Excel,
and then you’ll apply what you’ve learned to the Albuquerque price data.