CK-12-Basic Probability and Statistics Concepts - A Full Course


6.3. Standard Deviation of a Data Set http://www.ck12.org


If we think of the spread of the data away from the mean, which is measured by the standard deviation, as a
stepping process, then 1 step to the right or 1 step to the left of the mean is 1 standard deviation away from the
mean. Likewise, 2 steps in either direction are 2 standard deviations away from the mean, and 3 steps in either
direction are 3 standard deviations away from the mean. The standard deviation of a data set is simply a value, and
in the stepping process this value represents the size of your footstep as you move away from the mean. Once the
standard deviation has been calculated, it is added to the mean for each step to the right and subtracted from the
mean for each step to the left. If the value of the yellow mean tile were 58 and the standard deviation were 5, the
tiles to the right of the mean would read 63, 68, and 73, and the tiles to the left would read 53, 48, and 43.
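
The stepping arithmetic can be sketched in a few lines of Python. This is only an illustration of the example values given here (a mean of 58 and a standard deviation of 5); the function name tile_values and the choice of 3 steps on each side are assumptions made for the sketch, not part of the lesson.

```python
def tile_values(mean, sd, steps=3):
    """Return the tile values from `steps` footsteps left of the mean
    to `steps` footsteps right of it, each footstep being one sd."""
    return [mean + k * sd for k in range(-steps, steps + 1)]


# Example values from the text: mean tile 58, standard deviation 5.
tiles = tile_values(58, 5)
print(tiles)  # [43, 48, 53, 58, 63, 68, 73]
```

The middle entry is the mean itself, and each neighbor differs from the next by exactly one standard deviation.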


For a normal distribution, approximately 68% of the data values are located within 1 standard deviation of the mean,
which here is between 53 and 63. Approximately 95% of the data values are located within 2 standard deviations of
the mean, which is between 48 and 68. Finally, approximately 99.7% of the data values are located within 3 standard
deviations of the mean, which is between 43 and 73. These percentages make up what statisticians refer to as the
68-95-99.7 Rule. The percentages remain the same for all data that can be assumed to be normally distributed. The
following diagram represents the location of these values on a normal distribution curve.
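
The percentages in the rule can be checked numerically. The sketch below relies on the standard fact that, for a normal distribution, the fraction of values within k standard deviations of the mean equals erf(k / √2); the helper name fraction_within is an assumption introduced for illustration. It reuses the example mean of 58 and standard deviation of 5.

```python
import math


def fraction_within(k):
    """Fraction of a normal distribution lying within k standard
    deviations of the mean (standard result: erf(k / sqrt(2)))."""
    return math.erf(k / math.sqrt(2))


mean, sd = 58, 5
for k in (1, 2, 3):
    low, high = mean - k * sd, mean + k * sd
    print(f"within {k} SD: {low} to {high}, about {fraction_within(k):.1%}")
```

The computed values are roughly 68.3%, 95.4%, and 99.7%, so the 68 and 95 in the rule's name are rounded figures.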


Now that you understand the distribution of the data and exactly how it moves away from the mean, you are ready
to calculate the standard deviation of a data set. To keep the calculation steps organized, a table is used to record
the results of each step. The table consists of 3 columns. The first column contains the data and is labeled x. The
second column contains the differences between the data values and the mean of the data set. This column is
labeled (x − x̄) for a sample and (x − μ) for a population. The final column is labeled (x − x̄)^2 for a sample and
(x − μ)^2 for a population, and it contains the square of each of the values recorded in the second column.
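
The three-column table can be built directly in Python. This is a minimal sketch under stated assumptions: the data values are hypothetical (they do not come from the lesson), and the names deviation_table and standard_deviation are introduced only for illustration. The last step, which is not spelled out in this passage, uses the standard definitions: divide the sum of the third column by n − 1 for a sample or by n for a population, then take the square root.

```python
import math

# Hypothetical data set, chosen only so the arithmetic is easy to follow.
data = [2, 4, 4, 4, 5, 5, 7, 9]


def deviation_table(data):
    """Return the mean and one row per data value:
    (x, x - mean, (x - mean)^2)."""
    mean = sum(data) / len(data)
    rows = [(x, x - mean, (x - mean) ** 2) for x in data]
    return mean, rows


def standard_deviation(data, sample=True):
    """Standard deviation from the table's third column.
    Divides by n - 1 for a sample and by n for a population."""
    _, rows = deviation_table(data)
    total_squares = sum(square for _, _, square in rows)
    divisor = len(data) - 1 if sample else len(data)
    return math.sqrt(total_squares / divisor)


mean, rows = deviation_table(data)
print(f"{'x':>4} {'x - mean':>9} {'(x - mean)^2':>13}")
for x, diff, square in rows:
    print(f"{x:>4} {diff:>9.2f} {square:>13.2f}")

print("population SD:", standard_deviation(data, sample=False))
print("sample SD:", round(standard_deviation(data, sample=True), 3))
```

For this hypothetical data the mean is 5 and the third column sums to 32, so the population standard deviation is exactly 2, while the sample version is slightly larger because it divides by 7 instead of 8.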


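If we were to add the differences recorded in the second column of the table, the total would always be 0, because the
positive and negative differences from the mean cancel each other out. Taken at face value, this result of 0 would
imply that there is no variation between the data values and the mean. In other words, if we were conducting a survey of the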
