
Unbiasedness


Suppose we have a population for which we somehow know the mean (μ), say, the heights
of all basketball players in the NBA. If we were to draw one sample from that population
and calculate the sample mean (X̄), we would expect X̄ to be reasonably close to μ,
particularly if N is large, because it is an estimator of μ. So if the average height in this
population is 7.0′ (μ = 7.0′), we would expect a sample of, say, 10 players to have an average
height of approximately 7.0′ as well, although it probably would not be exactly equal to 7.0′.
(We can write X̄₁ ≈ μ, where the symbol ≈ means “approximately equal.”) Now suppose we
draw another sample and obtain its mean (X̄₂). (The subscript is used to differentiate the
means of successive samples. Thus, the mean of the 43rd sample, if we drew that many,
would be denoted by X̄₄₃.) This mean would probably also be reasonably close to μ, but we
would not expect it to be exactly equal to μ or to X̄₁. If we were to keep up this procedure
and draw sample means ad infinitum, we would find that the average of the sample means
would be precisely equal to μ. Thus, we say that the expected value (i.e., the long-range av-
erage of many, many samples) of the sample mean is equal to μ, the population mean that it
is estimating. An estimator whose expected value equals the parameter to be estimated is
called an unbiased estimator, and that is a very important property for a statistic to possess.
Both the sample mean and the sample variance are unbiased estimators of their
corresponding parameters. (We use N - 1 as the denominator of the formula for the sample
variance precisely because we want to generate an unbiased estimate.) By and large, unbiased
estimators are like unbiased people—they are nicer to work with than biased ones.
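
To make this concrete, here is a minimal simulation sketch (in Python; the population
parameters, sample size, and number of replications are illustrative assumptions, not values
from the text). Averaged over many repeated samples, the sample means converge on μ, and
the variance computed with N - 1 in the denominator converges on σ², whereas dividing by
N systematically underestimates it.

```python
# Minimal sketch of unbiasedness via simulation (illustrative values assumed).
import random

random.seed(1)

mu, sigma = 7.0, 0.3   # hypothetical population: mean height 7.0', SD 0.3'
N, reps = 10, 50_000   # sample size and number of repeated samples

mean_of_means = 0.0
avg_var_n1 = 0.0       # variance with N - 1 in the denominator (unbiased)
avg_var_n = 0.0        # variance with N in the denominator (biased)

for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(N)]
    xbar = sum(sample) / N
    ss = sum((x - xbar) ** 2 for x in sample)
    mean_of_means += xbar / reps
    avg_var_n1 += ss / (N - 1) / reps
    avg_var_n += ss / N / reps

print(f"mu = {mu}, long-run average of sample means = {mean_of_means:.4f}")
print(f"sigma^2 = {sigma**2:.4f}, average s^2 using N - 1 = {avg_var_n1:.4f}")
print(f"average 'variance' using N = {avg_var_n:.4f} (systematically too small)")
```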

Efficiency


Estimators are also characterized in terms of efficiency. Suppose that a population is
symmetric: Thus, the values of the population mean and median are equal. Now suppose that
we want to estimate the mean of this population (or, alternatively, its median). If we drew
many samples and calculated their means, we would find that the means (X̄) clustered rela-
tively closely around μ. The medians of the same samples, however, would cluster more
loosely around μ. This is so even though the median is also an unbiased estimator in this
situation because the expected value of the median in this case would also equal μ. The fact
that the sample means cluster more closely around μ than do the sample medians indicates
that the mean is more efficient as an estimator. (In fact, it is the most efficient estimator of μ.)
Because the mean is more likely to be closer to μ (i.e., a more accurate estimate) than the
median, it is a better statistic to use to estimate μ.
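
The same kind of simulation makes the efficiency difference visible. In this sketch (again
with arbitrarily chosen, illustrative population values), both the mean and the median of
each sample estimate μ without bias, but the sample means vary less from replication to
replication than the sample medians do.

```python
# Sketch comparing the sampling variability of the mean and the median
# for a symmetric (normal) population; parameter values are assumptions.
import random
import statistics

random.seed(1)
mu, sigma, N, reps = 100.0, 15.0, 25, 20_000

means, medians = [], []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(N)]
    means.append(statistics.mean(sample))
    medians.append(statistics.median(sample))

# Spread of each estimator around mu across repeated samples:
print(f"SD of sample means:   {statistics.stdev(means):.3f}")
print(f"SD of sample medians: {statistics.stdev(medians):.3f} (larger, so less efficient)")
```

For a normal population the standard deviation of the sample mean is σ/√N, whereas that of
the sample median is roughly 1.25 times larger, which is why the medians cluster more
loosely around μ.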
Although it should be obvious that efficiency is a relative term (a statistic is more or
less efficient than some other statistic), statements that such and such a statistic is “effi-
cient” should really be taken to mean that the statistic is more efficient than all other statis-
tics as an estimate of the parameter in question. Both the sample mean, as an estimate of μ,
and the sample variance, as an estimate of σ², are efficient estimators in that sense. The fact
that both the mean and the variance are unbiased and efficient is the major reason that they
play such an important role in statistics. These two statistics will form the basis for most of
the procedures discussed in the remainder of this book.

Resistance


The last property of an estimator to be considered concerns the degree to which the
estimator is influenced by the presence of outliers. Recall that the median is relatively
uninfluenced by outliers, whereas the mean can drastically change with the inclusion of one
or two extreme scores. In a very real sense we can say that the median “resists” the influence
of outliers.
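
A tiny hypothetical example (the scores are invented for illustration) shows this resistance
directly: adding a single extreme score moves the mean substantially but barely shifts the
median.

```python
# Resistance to outliers: the median barely moves, the mean jumps.
import statistics

scores = [10, 11, 12, 13, 14, 15, 16]
print(statistics.mean(scores), statistics.median(scores))   # 13 and 13

scores.append(300)                                          # one extreme score
print(statistics.mean(scores))                              # 48.875
print(statistics.median(scores))                            # 13.5
```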
