Statistical Methods for Psychology


The Average Deviation


At first glance it would seem that if we want to measure how scores are dispersed around
the mean (i.e., deviate from the mean), the most logical thing to do would be to obtain all
the deviations (i.e., Xi − X̄) and average them. You might reasonably think that the more
widely the scores are dispersed, the greater the deviations and therefore the greater the
average of the deviations. However, common sense has led you astray here. If you calculate
the deviations from the mean, some scores will be above the mean and have a positive
deviation, whereas others will be below the mean and have negative deviations. In the end,
the positive and negative deviations will balance each other out and the sum of the
deviations will be zero. This will not get us very far.

The Mean Absolute Deviation


If you think about the difficulty in trying to get something useful out of the average of the
deviations, you might well be led to suggest that we could solve the whole problem by taking
the absolute values of the deviations. (The absolute value of a number is the value of
that number with any minus signs removed. The absolute value is indicated by vertical bars
around the number, e.g., |−3| = 3.) The suggestion to use absolute values makes sense because
we want to know how much scores deviate from the mean without regard to whether
they are above or below it. The measure suggested here is a perfectly legitimate one and
even has a name: the mean absolute deviation (m.a.d.). The sum of the absolute deviations
is divided by N (the number of scores) to yield an average (mean) deviation:

m.a.d. = Σ|Xi − X̄| / N

For all its simplicity and intuitive appeal, the mean absolute deviation has not played an
important role in statistical methods. Much more useful measures, the variance and the
standard deviation, are normally used instead.
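The m.a.d. formula above can be sketched in a few lines. The scores here are hypothetical, chosen only to illustrate the computation:

```python
# Hedged sketch of the mean absolute deviation (m.a.d.) with
# illustrative scores: sum the absolute deviations, divide by N.
scores = [5, 8, 3, 9, 5]
n = len(scores)
mean = sum(scores) / n                              # 6.0
mad = sum(abs(x - mean) for x in scores) / n
print(mad)                                          # 2.0
```

Unlike the raw deviations, the absolute deviations cannot cancel, so the result genuinely reflects spread.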

The Variance


The measure that we will consider in this section, the sample variance (s²), represents a
different approach to the problem of the deviations themselves averaging to zero. (When
we are referring to the population variance, rather than the sample variance, we use
σ² [lowercase sigma squared] as the symbol.) In the case of the variance we take advantage of
the fact that the square of a negative number is positive. Thus, we sum the squared deviations
rather than the absolute deviations. Because we want an average, we next divide that
sum by some function of N, the number of scores. Although you might reasonably expect
that we would divide by N, we actually divide by (N − 1). We use (N − 1) as a divisor for
the sample variance because, as we will see shortly, it leaves us with a sample variance that
is a better estimate of the corresponding population variance. (The population variance is
calculated by dividing the sum of the squared deviations, for each value in the population,
by N rather than (N − 1). However, we only rarely calculate a population variance; we
almost always estimate it from a sample variance.)
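The claim that dividing by (N − 1) gives a better estimate can be checked with a small simulation. The sketch below assumes a normal population with variance 4 (an assumption made here for illustration, not from the text) and compares the two divisors across many samples:

```python
# Simulation sketch (assumed normal population, variance 4): dividing the
# sum of squared deviations by N systematically underestimates the
# population variance, while dividing by N - 1 lands close to it.
import random

random.seed(0)
true_var = 4.0
n, trials = 5, 20000
biased = unbiased = 0.0
for _ in range(trials):
    sample = [random.gauss(0, true_var ** 0.5) for _ in range(n)]
    m = sum(sample) / n
    ss = sum((x - m) ** 2 for x in sample)  # sum of squared deviations
    biased += ss / n                        # divide by N
    unbiased += ss / (n - 1)                # divide by N - 1
biased /= trials
unbiased /= trials
print(biased, unbiased)  # biased runs low (near 3.2); unbiased sits near 4.0
```

With N = 5, dividing by N shrinks every estimate by the factor (N − 1)/N = 0.8, which is why the biased average settles near 3.2 rather than 4.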
If it is important to specify more precisely the variable to which s² refers, we can subscript
it with a letter representing the variable. Thus, if we denote the data in Set 4 as X, the
variance could be denoted as s²X. You could refer to s²Set 4, but long subscripts are usually
awkward. In general, we label variables with simple letters like X and Y.
For our example, we can calculate the sample variances of Set 4 and Set 32 by applying
the defining formula to each set in turn:^10

s²X = Σ(Xi − X̄)² / (N − 1)
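The formula translates directly into code. The scores below are hypothetical stand-ins, since Set 4's actual values are not reproduced here:

```python
# Minimal sketch of the sample variance formula with illustrative scores:
# sum the squared deviations from the mean, divide by N - 1.
scores = [5, 8, 3, 9, 5]
n = len(scores)
mean = sum(scores) / n                                  # 6.0
s2 = sum((x - mean) ** 2 for x in scores) / (n - 1)
print(s2)                                               # 6.0
```

Note the squaring step: it both removes the signs (like the absolute value did for the m.a.d.) and gives the variance the mathematical properties that make it so central to later chapters.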

40 Chapter 2 Describing and Exploring Data


^10 In these calculations and others throughout the book, my answers may differ slightly from those that you obtain
for the same data. If so, the difference is most likely caused by rounding. If you repeat my calculations and arrive
at a similar, though different, answer, that is sufficient.
