http://www.ck12.org Chapter 1. An Introduction to Analyzing Statistical Data
According to the drawing, Chebyshev’s Theorem states that at least 75% of the data is between 40.5 and 98.1. This
does not seem very significant in this example, because all of the data falls within that range. In a later
chapter we will learn a more informative rule about standard deviation, but the advantage of Chebyshev’s Theorem
is that it applies to any sample or population, no matter how it is distributed.
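The 75% figure corresponds to the Chebyshev bound for two standard deviations: for any k greater than 1, at least 1 - 1/k^2 of the data lies within k standard deviations of the mean. The short Python sketch below computes that bound and compares it with the observed fraction for a small data set. The function names and the data values are hypothetical, chosen only for illustration; they are not the data from the drawing.

```python
def chebyshev_min_fraction(k):
    """Minimum fraction of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's Theorem is only informative for k > 1")
    return 1 - 1 / k ** 2


def fraction_within(data, k):
    """Observed fraction of values within k population standard deviations of the mean."""
    n = len(data)
    mean = sum(data) / n
    sd = (sum((x - mean) ** 2 for x in data) / n) ** 0.5
    return sum(1 for x in data if abs(x - mean) <= k * sd) / n


# Hypothetical data (an assumption for illustration, not from the text)
example = [2, 4, 4, 4, 5, 5, 7, 9]

print(chebyshev_min_fraction(2))    # 0.75, the guaranteed minimum
print(fraction_within(example, 2))  # 1.0, the observed fraction for this data
```

As in the example above, the observed fraction can be much larger than the guaranteed minimum; the theorem only promises that it will never be smaller.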
Lesson Summary
When examining a set of data, we also use descriptive statistics to provide information about how the data is
spread out. The range is a measure of the difference between the smallest and largest numbers in a data set. The
interquartile range is the difference between the upper and lower quartiles. A more informative measure of spread
is based on the mean. We can look at how individual points vary from the mean by subtracting the mean from the
data value. This is called the deviation. The standard deviation is a measure of the “average” deviation for the
entire data set. Because the deviations always sum to zero, we find the standard deviation by adding the squared
deviations. When we have the entire population, the sum of the squared deviations is divided by the population
size. This quantity is called the variance. Taking the square root of the variance gives the standard deviation. For
a population, the standard deviation is denoted σ. Because a sample is prone to random variation (sampling error),
we adjust the sample standard deviation to make it a little larger by dividing the sum of the squared deviations by
one less than the number of observations. The result of that division is the sample variance, and the square root of
the sample variance is the sample standard deviation, usually denoted s. Chebyshev’s Theorem gives us information
about the minimum percentage of data that is within a certain number of standard deviations of the mean. It applies
to any population or sample, regardless of how that data is distributed.
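The measures in this summary can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data values are hypothetical, the function name spread_summary is made up for this example, and the quartiles are taken as the medians of the lower and upper halves of the sorted data, which is one common convention among several.

```python
from statistics import median


def spread_summary(data):
    """Return several measures of spread for a list of numbers."""
    values = sorted(data)
    n = len(values)
    mean = sum(values) / n

    # Deviation: each data value minus the mean
    deviations = [x - mean for x in values]
    sum_sq = sum(d ** 2 for d in deviations)

    # Quartiles as medians of the lower and upper halves (assumed convention;
    # the middle value is left out of both halves when n is odd)
    half = n // 2
    lower, upper = values[:half], values[n - half:]

    return {
        "range": values[-1] - values[0],
        "iqr": median(upper) - median(lower),
        "deviation_sum": sum(deviations),          # zero, up to rounding
        "population_variance": sum_sq / n,         # divide by N
        "population_sd": (sum_sq / n) ** 0.5,
        "sample_variance": sum_sq / (n - 1),       # divide by n - 1
        "sample_sd": (sum_sq / (n - 1)) ** 0.5,
    }


# Hypothetical data (an assumption for illustration, not from the text)
result = spread_summary([2, 4, 4, 4, 5, 5, 7, 9])
for name, value in result.items():
    print(name, value)
```

For this data the sample standard deviation comes out slightly larger than the population standard deviation, which is the adjustment described above.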
Points to Consider
- How do you determine which measure of spread best describes a particular data set?
- What information does the standard deviation tell us about the specific, real data being observed?
- What are the effects of outliers on the various measures of spread?
- How does altering the spread of a data set affect its visual representation(s)?
Review Questions
- Use the rainfall data from Figure 1 to answer this question.
a. Calculate and record the sample mean:
b. Complete the chart to calculate the standard deviation and the variance.