Data Analysis with Microsoft Excel: Updated for Office 2007

(Tuis.) #1
Chapter 4 Describing Your Data 161

other out. In fact, the sum of the deviations between each sample value and
the sample mean equals zero, so the average deviation is also zero.
Instead of averaging the deviations, we’ll square each deviation (to make
it positive) and then sum those values and divide by the number of observa-
tions minus 1. This value, known as the variance, is represented by s^2. The
formula for calculating s^2 is

s^25

Sum of squared deviations
Number of observations 21

5

1

n 21 a

n

i 51

(^1) xi 2 x 22
One measure of variability, the standard deviation (represented by the sym-
bol s), is calculated by taking the square root of the variance. The complete
formula for the standard deviation s is
s 5
ã
a
n
i 51
(^1) xi 2 x 22
n 21
Why do we divide the total of the squared deviations by n – 1, rather than n?
Recall that the sum of the deviations is known to be zero, so given the fi rst
n – 1 deviations, we can always calculate the remaining deviation. This means
only n – 1 of the deviations can vary freely; the last value is constrained by
the values of the preceding deviations. This fi gure, n – 1, is known as the
degrees of freedom and is a value that will become more important in the
chapters that follow.
The standard deviation represents the typical deviation of values from the
average. A large value of s indicates a high degree of variability in the data.
High is a relative term, and we usually speak about high degrees of variability
only when comparing one distribution with another. Table 4-6 summarizes
the different functions supported by Excel to describe the variability of data.
Table 4-6 Formulas to calculate variability of values in data sets
Function Description
AVEDEV(array) Returns the average of the absolute value of the deviations in an array
or data range.
DEVSQ(array) Returns the sum of the squared deviations in an array or data range.
MAX(array) Returns the maximum value in an array or data range.
MIN(array) Returns the minimum value in an array or data range.
STDEV(array) Returns the standard deviation of the values in an array or data range.
VAR(array) Returns the variance of the values in an array or data range.
RANGEVALUE
(array)
Returns the range of the values in an array or range reference.
StatPlus required.

Free download pdf