CK-12 Probability and Statistics - Advanced

1.3. Measures of Center http://www.ck12.org


Trimmed Mean


Remember that the mean is not resistant to the effects of outliers. Many students ask their teacher to “drop the lowest
grade.” The argument is that everyone has a bad day, and one extreme grade that is not typical of the rest of their work
should not have such a strong influence on their mean grade. The problem is that this can work both ways; it could
also be true that a student who is performing poorly most of the time could have a really good day (or even get lucky)
and get one extremely high grade. We wouldn’t blame this student for not asking the teacher to drop the highest
grade! Attempting to more accurately describe a data set by removing the extreme values is referred to as trimming
the data. To be fair though, a valid trimmed statistic must remove both the extreme maximum and minimum values.
So, while some students might disapprove, to calculate a trimmed mean, you remove the maximum and minimum
values, add up the values that remain, and divide by the number of values that remain.


Let’s go back to Ron’s grades again:


75 , 80 , 90 , 94 , 96


A trimmed mean would remove the smallest and largest values, 75 and 96, and divide the sum of the remaining three grades by 3.


$$\cancel{75},\ 80,\ 90,\ 94,\ \cancel{96}$$


$$\frac{80 + 90 + 94}{3} = \frac{264}{3} = 88$$
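
The same calculation can be sketched in a few lines of Python. This is only an illustration: the function name `trimmed_mean` and the variable `ron_grades` are our own choices, not part of any library.

```python
def trimmed_mean(values):
    """Drop one minimum and one maximum, then average the values that remain."""
    if len(values) < 3:
        raise ValueError("need at least three values to trim both ends")
    ordered = sorted(values)   # smallest value first, largest value last
    kept = ordered[1:-1]       # discard the first and last entries
    return sum(kept) / len(kept)


ron_grades = [75, 80, 90, 94, 96]
print(trimmed_mean(ron_grades))  # (80 + 90 + 94) / 3 = 88.0
```

Sorting first means the grades can be entered in any order; only one copy of the minimum and one copy of the maximum are removed, even if those values are repeated.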


n% Trimmed Mean


Instead of removing just the minimum and maximum in a larger data set, a statistician may choose to remove a
certain percentage of the extreme values. This is called an n% trimmed mean. To perform this calculation, you
would remove the specified percentage of the values from the data, half from each end. For example, in a data
set that contained 100 numbers, if a researcher wanted to calculate a 10% trimmed mean, she would need to remove
10% of the data, or 5% from each end. In this simplified example, the five smallest and the five largest values would
be discarded and the sum of the remaining numbers would be divided by 90.
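
Here is a minimal sketch of that simplified case in Python, assuming the requested percentage works out to a whole number of values at each end. The function name `percent_trimmed_mean_exact` and the stand-in data (the whole numbers 1 through 100) are hypothetical, chosen only to make the arithmetic easy to check.

```python
def percent_trimmed_mean_exact(values, percent):
    """Remove `percent` percent of the values in total, half from each end.

    Assumes `percent` is a whole number and that the count at each end
    comes out to a whole number; otherwise a ValueError is raised.
    """
    if not 0 <= percent < 100:
        raise ValueError("percent must be at least 0 and less than 100")
    n = len(values)
    # The total removed is n * percent / 100, so each end loses n * percent / 200.
    per_end, remainder = divmod(n * percent, 200)
    if remainder != 0:
        raise ValueError("this percentage does not split evenly between the two ends")
    ordered = sorted(values)
    kept = ordered[per_end:n - per_end]
    return sum(kept) / len(kept)


data = list(range(1, 101))                   # stand-in data set of 100 numbers
print(percent_trimmed_mean_exact(data, 10))  # keeps 6 through 95, prints 50.5
```

With 100 values and a 10% trim, five values come off each end and the sum of the remaining 90 values is divided by 90, as described above.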


In “real” data, it is not always so straightforward. To illustrate this, let’s return to our data from the number of
children in a household and calculate a 10% trimmed mean. Here is the data set:


1 , 3 , 4 , 3 , 1 , 2 , 2 , 2 , 1 , 2 , 2 , 3 , 4 , 5 , 1 , 2 , 3 , 2 , 1 , 2 , 3 , 6


Placing the data in order yields:


1 , 1 , 1 , 1 , 1 , 2 , 2 , 2 , 2 , 2 , 2 , 2 , 2 , 3 , 3 , 3 , 3 , 3 , 4 , 4 , 5 , 6


With 22 values, 10% of them is 2.2, so we could remove 2 numbers, one from each end (2 total, or approximately 9%
trimmed), or we could remove 2 numbers from each end (4 total, or approximately 18% trimmed). Some statisticians
would calculate both of these and then use proportions to find an approximation for 10%. Others might argue that
9% is closer, so we should use that value. For our purposes, and to stay consistent with the way we handle similar
situations in later chapters, we will always opt to remove more numbers than necessary. The logic behind this is
simple. You are claiming to remove 10% of the numbers. If you cannot remove exactly 10%, then you have to
remove either more or less. We would prefer to err on the side of caution and remove at least the percentage reported.
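
The round-up convention can be sketched in Python as follows. The function name `percent_trimmed_mean` and the variable `children` are our own, and the sketch assumes the percentage is given as a whole number and refers to the total removed from both ends combined. Be aware that some statistical software defines the trimming percentage per end instead, so its results may not match this convention.

```python
def percent_trimmed_mean(values, percent):
    """Mean after removing at least `percent` percent of the values, half from each end."""
    if not 0 <= percent < 100:
        raise ValueError("percent must be at least 0 and less than 100")
    n = len(values)
    # Each end should lose n * percent / 200 values. When that is not a whole
    # number we round up, so at least the reported percentage is removed.
    # -(-a // b) is ceiling division using integers only.
    per_end = -(-(n * percent) // 200)
    kept = sorted(values)[per_end:n - per_end]
    if not kept:
        raise ValueError("trimming would remove every value")
    return sum(kept) / len(kept)


children = [1, 3, 4, 3, 1, 2, 2, 2, 1, 2, 2, 3, 4, 5, 1, 2, 3, 2, 1, 2, 3, 6]

# 10% of 22 is 2.2, or 1.1 per end, which rounds up to 2 per end (4 values removed).
print(percent_trimmed_mean(children, 10))   # 42 / 18, about 2.33

# For comparison, removing only one value from each end (about 9% trimmed):
print(sum(sorted(children)[1:-1]) / 20)     # 48 / 20 = 2.4
```

Removing two values from each end discards two of the 1s along with the 5 and the 6, leaving 18 values whose sum is 42.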
