for whatever the true parameter value θ may be. Hence, we stated unbiasedness as a generally preferable quality criterion. Yet, a bias of zero may be too restrictive a criterion if an estimator θ̂ is only slightly biased but has a favorably small variance compared with all possible alternatives, biased or unbiased. So, we need a quality criterion that accounts for both bias and variance.
That criterion can be satisfied by using the mean squared error (MSE).
Penalizing the squared deviation rather than the deviation itself, the MSE is defined as the expected squared loss
MSE(θ̂) = E_θ[(θ̂ − θ)^2],
where the subscript θ indicates that the mean depends on the true but
unknown parameter value. The mean squared error can be decomposed into the variance of the estimator plus the squared bias,

MSE(θ̂) = Var_θ(θ̂) + [Bias_θ(θ̂)]^2.

If the bias is zero (i.e., the estimator is unbiased), the mean squared error equals the variance of the estimator.
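As an illustration, the following minimal Python sketch checks this decomposition by simulation. The setup, normal data with true variance σ^2 = 4 and the plug-in variance estimator with divisor n, is an assumed example chosen for demonstration, not one prescribed by the text.

```python
import numpy as np

rng = np.random.default_rng(42)

# Assumed setup for illustration: normal data with known true variance,
# estimated by the plug-in variance estimator with divisor n.
mu, sigma2 = 0.0, 4.0
n, reps = 20, 100_000

samples = rng.normal(mu, np.sqrt(sigma2), size=(reps, n))
est = samples.var(axis=1, ddof=0)     # one estimate per simulated sample

bias = est.mean() - sigma2            # Monte Carlo bias E[theta_hat] - theta
variance = est.var()                  # Monte Carlo variance of the estimator
mse = np.mean((est - sigma2) ** 2)    # Monte Carlo MSE E[(theta_hat - theta)^2]

print(f"variance + bias^2 = {variance + bias**2:.4f}")
print(f"MSE               = {mse:.4f}")   # agrees up to simulation noise
```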
It is interesting to note that MSE-minimal estimators are not available for all parameters. That is, we may face a trade-off between reducing the bias and reducing the variance over a set of possible estimators. As a consequence, we often settle for the estimator with the smallest variance among all unbiased estimators, the so-called minimum-variance unbiased estimator. We do this because in many applications, unbiasedness has priority over precision.
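To make the trade-off concrete, here is a minimal simulation sketch comparing the unbiased sample variance (divisor n − 1) with the biased plug-in version (divisor n); the normal data and parameter values are assumptions chosen for illustration. Under normal data, the biased version attains a smaller MSE, yet the unbiased one is the choice when unbiasedness takes priority.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma2, n, reps = 1.0, 10, 200_000   # assumed true variance and sample size

samples = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))

# Unbiased sample variance (divisor n - 1) vs. biased plug-in version (divisor n).
estimators = {
    "unbiased (n - 1)": samples.var(axis=1, ddof=1),
    "biased   (n)    ": samples.var(axis=1, ddof=0),
}

for name, est in estimators.items():
    bias = est.mean() - sigma2
    mse = np.mean((est - sigma2) ** 2)
    print(f"{name}: bias = {bias:+.4f}, MSE = {mse:.4f}")
# The biased version accepts a small bias in exchange for a smaller
# variance and, under normal data, a smaller MSE overall.
```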
Large-Sample Criteria
The treatment of estimators thus far has not considered how their behavior may change as the sample size n varies. This is, however, an important aspect of estimation. For example, an estimator that is biased for any given finite n may gradually lose its bias as n increases. Here we analyze estimators as the sample size approaches infinity. In technical terms, we focus on the so-called large-sample or asymptotic properties of estimators.
Consistency Some estimators display stochastic behavior that changes as we increase the sample size. Their exact distribution, including its parameters, may be unknown as long as the number of draws n is small or, to be precise, finite. This makes it difficult to evaluate the quality of such estimators. For example, it may be impossible to state the exact bias of some estimator for finite n, in contrast to when n approaches infinity.
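A minimal simulation sketch illustrates the kind of behavior described above, an estimator whose finite-sample bias vanishes as n increases. The plug-in variance estimator (divisor n) serves as an assumed example; for this estimator the bias −σ^2/n happens to be known exactly, which makes it easy to compare simulation against theory.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma2, reps = 1.0, 20_000   # assumed true variance and number of replications

# The plug-in variance estimator has expectation (n - 1)/n * sigma^2,
# so its finite-sample bias is -sigma^2 / n, which vanishes as n grows.
for n in (5, 20, 100, 1000):
    samples = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
    est = samples.var(axis=1, ddof=0)
    print(f"n = {n:4d}: simulated bias = {est.mean() - sigma2:+.5f}, "
          f"theoretical bias = {-sigma2 / n:+.5f}")
```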