AP Statistics 2017


• It is sensitive to the spread. The greater the spread, the larger the standard deviation will be. For two
datasets with the same mean, the one with the larger standard deviation has more variability.
• It is independent of n. Because we are averaging squared distances from the mean, the standard
deviation will not get larger just because we add more terms.
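The second point can be illustrated with a small sketch. The dataset here is hypothetical, chosen only for illustration: repeating the same values three times adds terms without changing the spread. The sketch computes both the version that divides by n (`pstdev`) and the version that divides by n − 1 (`stdev`); which of the two the text has in mind at this point is an assumption.

```python
import statistics

# Hypothetical data, used only to illustrate the point above.
data = [2, 4, 6, 8]
tripled = data * 3  # same values repeated: three times as many terms, same spread

# Dividing by n: the result is unchanged when the data are repeated.
p_small = statistics.pstdev(data)
p_large = statistics.pstdev(tripled)

# Dividing by n - 1: the result does not grow either (it shrinks slightly).
s_small = statistics.stdev(data)
s_large = statistics.stdev(tripled)

print(round(p_small, 3), round(p_large, 3))
print(round(s_small, 3), round(s_large, 3))
```

Either way, adding more terms does not by itself make the standard deviation larger.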
example: Find the standard deviation of the following 6 numbers: 3, 4, 6, 6, 7, 10.


solution:
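One way to check the calculation step by step is sketched below. It assumes the sample standard deviation, which divides the sum of squared deviations by n − 1 (the value a TI calculator reports as Sx); the variable names are illustrative.

```python
import math

data = [3, 4, 6, 6, 7, 10]
n = len(data)

mean = sum(data) / n                             # 36 / 6 = 6.0
squared_devs = [(x - mean) ** 2 for x in data]   # 9, 4, 0, 0, 1, 16
variance = sum(squared_devs) / (n - 1)           # 30 / 5 = 6.0 (assumes the n - 1 divisor)
s = math.sqrt(variance)                          # square root of 6

print(round(s, 3))  # 2.449
```

Under that assumption, the standard deviation is about 2.45.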

Note that the standard deviation, like the mean, is not resistant to extreme values. Because it depends
upon distances from the mean, it should be clear that extreme values will have a major impact on the
numerical value of the standard deviation. Note also that, in practice, you will never have to do the
calculation above by hand—you will rely on your calculator.
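To see how strongly one extreme value can affect the standard deviation, here is a minimal sketch. The second dataset is hypothetical: it is the example data with the 10 replaced by 100. The sketch again assumes the sample standard deviation (n − 1 divisor).

```python
import statistics

original = [3, 4, 6, 6, 7, 10]
with_extreme = [3, 4, 6, 6, 7, 100]  # hypothetical: the 10 replaced by 100

s_original = statistics.stdev(original)      # sample standard deviation (n - 1)
s_extreme = statistics.stdev(with_extreme)

print(round(s_original, 2), round(s_extreme, 2))  # 2.45 38.73
```

Changing a single value moves the standard deviation from about 2.45 to about 38.73, even though the other five values are untouched.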


Interquartile Range


Although the standard deviation works well in situations where the mean works well (reasonably
symmetric distributions), we need a measure of spread that works well when a mean-based measure is
not appropriate. That measure is called the interquartile range.
Remember that the median of a distribution divides the distribution in two—it is the middle of the
distribution. The medians of the upper and lower halves of the distribution, not including the median itself
in either half, are called quartiles. The median of the lower half is called the lower quartile, or the first
quartile (the 25th percentile, shown as Q1 on the calculator). The median of the upper half is called the
upper quartile, or the third quartile (the 75th percentile, shown as Q3 on the calculator). The median
itself can be thought of as the second quartile, or Q2 (although we usually don't).
The interquartile range (IQR) is the difference between Q3 and Q1. That is, IQR = Q3 – Q1. When
you do 1-Var Stats, the calculator will return Q1 and Q3 along with several other statistics. You have to
compute the IQR from Q1 and Q3 yourself. Note that the IQR is the width of the interval that contains the middle 50% of the data.


example: Find Q1, Q3, and the IQR for the following dataset: 5, 5, 6, 7, 8, 9, 11, 13, 17.
solution: Because the data are in order, and there is an odd number of values (9), the median is 8.
The bottom half of the data comprises 5, 5, 6, 7. The median of the bottom half is the average of
5 and 6, or 5.5, which is Q1. Similarly, Q3 is the median of the top half (9, 11, 13, 17), which is the mean of 11
and 13, or 12. The IQR = 12 – 5.5 = 6.5.
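The same steps can be sketched in code. The helper `quartiles` below is a hypothetical function written for this illustration, not a library routine; it follows the definition given above, taking the medians of the lower and upper halves and leaving the overall median out of both halves when the number of values is odd.

```python
import statistics

def quartiles(values):
    """Return (Q1, median, Q3) as medians of the lower and upper halves.

    When the count is odd, the overall median is excluded from both halves.
    """
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]
    # Skip the middle value when the count is odd; otherwise split evenly.
    upper = data[half + 1:] if len(data) % 2 else data[half:]
    return statistics.median(lower), statistics.median(data), statistics.median(upper)

q1, med, q3 = quartiles([5, 5, 6, 7, 8, 9, 11, 13, 17])
iqr = q3 - q1
print(q1, med, q3, iqr)  # 5.5 8 12.0 6.5
```

Other software may define quartiles slightly differently, so results from a library function can differ from this method on some datasets.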
example: Find the standard deviation and IQR for the number of home runs hit by Babe Ruth in
his major league career. The number of home runs was: 0, 4, 3, 2, 11, 29, 54, 59, 35, 41, 46,
25, 47, 60, 54, 46, 49, 46, 41, 34, 22, 6.
solution: We put these numbers into a TI-83/84 list and do 1-Var Stats on that list. The calculator
returns Sx = 20.21, Q1 = 11, and Q3 = 47. Hence the IQR = Q3 – Q1 = 47 – 11 = 36.
The range of the distribution is the difference between the maximum and minimum scores in the
distribution. For the home run data, the range equals 60 – 0 = 60. Although this is sometimes used as a
measure of spread, it is not very useful because we are usually interested in how the data spread out from
the center of the distribution, not in just how far it is from the minimum to the maximum values.
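For readers without a calculator at hand, the home run results can be reproduced with a short sketch. It assumes the sample standard deviation (n − 1 divisor, matching Sx) and the median-of-halves definition of the quartiles described earlier; the variable names are illustrative.

```python
import statistics

home_runs = [0, 4, 3, 2, 11, 29, 54, 59, 35, 41, 46,
             25, 47, 60, 54, 46, 49, 46, 41, 34, 22, 6]

# Split the sorted data into lower and upper halves (22 values, so 11 each).
data = sorted(home_runs)
half = len(data) // 2
lower = data[:half]
upper = data[half + 1:] if len(data) % 2 else data[half:]

s = statistics.stdev(home_runs)       # sample standard deviation (n - 1)
q1 = statistics.median(lower)         # median of the lower half
q3 = statistics.median(upper)         # median of the upper half
iqr = q3 - q1
data_range = max(home_runs) - min(home_runs)

print(round(s, 2), q1, q3, iqr, data_range)  # 20.21 11 47 36 60
```

The standard deviation, quartiles, IQR, and range agree with the values given above.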


Outliers
