CK-12 Probability and Statistics - Advanced


1.3. Measures of Center http://www.ck12.org


Mean vs. Median


Both the mean and the median are important and widely used measures of center. So you might wonder why we
need them both. There is an important difference between them that can be explained by the following example.


Let’s say that you get an 85 and a 93 on your first two statistics quizzes, but then you have a really bad day and get a
14 on your next quiz!


The mean of your three grades would be 64, since (85 + 93 + 14) / 3 = 192 / 3 = 64! What would the median be? Which
is a better measure of your performance? As you can see, the middle number in the ordered set is 85. That middle value
does not change whether the lowest grade is an 84 or a 14. However, when you add the three numbers to find the mean,
the sum will be much smaller if the lowest grade is a 14. If you divide a much smaller sum by 3, the mean will also be
much smaller.
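To check this arithmetic yourself, here is a minimal Python sketch using the standard `statistics` module. The variable names are ours, chosen only for illustration. It compares the quiz grades from the example with the same grades when the lowest score is an 84 instead of a 14.

```python
from statistics import mean, median

# Quiz grades from the example: two good scores and one very bad day.
grades_bad_day = [85, 93, 14]
# The same quizzes, but with a lowest grade of 84 instead of 14.
grades_ok_day = [85, 93, 84]

for label, grades in [("lowest = 14", grades_bad_day),
                      ("lowest = 84", grades_ok_day)]:
    # The median is 85 in both cases; only the mean changes.
    print(label, "| mean:", mean(grades), "| median:", median(grades))
```

The median stays at 85 in both cases, while the mean drops from about 87.3 to 64 once the 14 is included.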


Outliers and Resistance


So, why are the mean and median so different in this example? It is because there is one grade that is extremely
different from the rest of the data. In statistics, we call such extreme values outliers. The mean is affected by the
presence of an outlier; however, the median is not. A statistic that is not affected by outliers is called resistant. We
say that the median is a resistant measure of center, and the mean is not resistant. In a sense, the median is able
to resist the pull of a far away value, but the mean is drawn to such values. It cannot resist the influence of outlier
values. Remember the balancing point example? If you created another number that was far away, you would be
forced to move the block toward it to make it stay balanced.
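The balancing-point idea can also be tried out in a few lines of Python. This is only a sketch: the data values and the helper `total_deviation` are made up for illustration. It shows that the signed distances from the mean cancel out, and that adding one far-away value drags the mean toward it while the median barely moves.

```python
from statistics import mean, median

def total_deviation(values, point):
    """Sum of the signed distances from each value to the given point."""
    return sum(v - point for v in values)

data = [85, 93, 84]          # hypothetical values that sit close together
with_outlier = data + [14]   # the same values plus one far-away value

balance_before = mean(data)
balance_after = mean(with_outlier)

# At the mean, the signed deviations cancel (up to floating-point rounding).
print(abs(total_deviation(data, balance_before)) < 1e-9)
print(abs(total_deviation(with_outlier, balance_after)) < 1e-9)

# The balance point (the mean) shifts toward the far-away value;
# the median changes only slightly.
print("mean:  ", balance_before, "->", balance_after)
print("median:", median(data), "->", median(with_outlier))
```

With these values the mean falls from about 87.3 to 69, while the median only changes from 85 to 84.5.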


As a result, when we have a data set that contains an outlier, it is often better to use the median to describe the center,
rather than the mean. For example, in 2005 the CEO of Yahoo, Terry Semel, was paid almost $231 million (see
http://www.forbes.com/static/execpay2005/rank.html). This is certainly not typical of what the “average” worker at
Yahoo could expect to make. Instead of using the mean salary to describe how Yahoo pays its employees, it would
be more appropriate to use the median salary of all the employees. You will often see medians used to describe the
typical value of houses in a given area, as the presence of a very few extremely large and expensive homes could
make the mean appear misleadingly large.
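To get a feel for how strongly one very large value can distort the mean, consider the following Python sketch. The six ordinary salaries are invented purely for illustration and are not real figures for any company. Only the roughly $231 million pay package comes from the example above.

```python
from statistics import mean, median

# Hypothetical annual salaries in dollars, invented for illustration only.
salaries = [48_000, 52_000, 55_000, 61_000, 67_000, 72_000]
# The same salaries plus one pay package of about $231 million.
with_executive = salaries + [231_000_000]

print("without outlier | mean:", round(mean(salaries)),
      "| median:", median(salaries))
print("with outlier    | mean:", round(mean(with_executive)),
      "| median:", median(with_executive))
```

In this made-up data set, the mean jumps from under $60,000 to more than $33 million, while the median moves only from $58,000 to $61,000.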


Population Mean vs. Sample Mean


Now that we understand some basic concepts about the mean, it is important to be able to represent and understand
the mean symbolically. When we calculate the mean as a statistic from a finite sample of data, we call it the
sample mean, and, as we have already mentioned, the symbol for it is $\bar{X}$. Written symbolically, then, the formula
for a sample mean is:

$$\bar{X} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$

where $x_1, x_2, \ldots, x_n$ are the $n$ values in the sample.
