The Essentials of Biostatistics for Physicians, Nurses, and Clinicians

52 CHAPTER 4 Normal Distribution and Related Properties

This, however, is not exactly right, because S is a random quantity
and not the constant σ. For large n, the resulting distribution is close
to the standard normal, but this is not so when n is small. Gosset, who
wrote under the pen name Student, worked with experiments involving small
n. For this case, Gosset was able to discover the exact distribution, and
a formal mathematical proof that he was correct was later supplied by
R. A. Fisher. We will discuss Gosset's t-distribution later in this chapter.
It is also true for any distribution with a finite variance that the
sample mean is an unbiased estimator of the population mean, and if
σ is the standard deviation of these observations, which we assume are
independent and come from the same distribution, then the standard
deviation of the sample mean is σ/√n. However, inference cannot be
exact unless we know the distribution of the sample mean, except for
the parameter μ. Again, σ is a nuisance parameter, and we will use
√n(X̄ − μ)/S to draw inference.
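As a concrete illustration, here is a minimal sketch in Python of computing the studentized quantity √n(X̄ − μ)/S from a sample. The helper name `t_statistic` and the simulated data are illustrative assumptions, not from the text:

```python
import math
import random

def t_statistic(sample, mu):
    """Studentized statistic sqrt(n) * (xbar - mu) / S,
    where S is the sample standard deviation (n - 1 denominator)."""
    n = len(sample)
    xbar = sum(sample) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
    return math.sqrt(n) * (xbar - mu) / s

# Illustrative small sample drawn from a normal with mean 10, SD 2.
random.seed(1)
sample = [random.gauss(10.0, 2.0) for _ in range(8)]
print(round(t_statistic(sample, 10.0), 3))
```

Under normality this quantity follows the t-distribution with n − 1 degrees of freedom, which is what makes it usable for exact small-sample inference.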
However, we can no longer assume that each observation has a
normal distribution. In Gosset's case, as long as the observations were
independent and normally and identically distributed with mean μ and
standard deviation σ, √n(X̄ − μ)/S would have the t-distribution with
n − 1 degrees of freedom. The "degrees of freedom" is the parameter
of the t-distribution, and as the degrees of freedom get larger, the
t-distribution comes closer to the standard normal distribution. But in our
current situation, where the distribution of the observations may not be
normal, √n(X̄ − μ)/S may not have a t-distribution either. Its exact
distribution depends on the distribution of the observations. So how do
we do the statistical inference?
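The convergence of the t-distribution to the standard normal can be checked numerically. The sketch below (an illustration, not from the text) evaluates the t density at zero using the standard formula with Γ computed via `math.lgamma`, and compares it with the standard normal density at zero, 1/√(2π) ≈ 0.3989:

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom.
    Uses lgamma to avoid overflow for large df."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    c = math.exp(log_c) / math.sqrt(df * math.pi)
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

normal_at_0 = 1 / math.sqrt(2 * math.pi)  # standard normal density at 0
print("normal:", round(normal_at_0, 4))
for df in (1, 4, 29, 100):
    print("df =", df, "t density at 0:", round(t_pdf(0.0, df), 4))
```

As the degrees of freedom grow, the density at zero climbs toward the normal value, mirroring the statement that the t-distribution approaches the standard normal.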
The saving grace that allows approximate inference is the central
limit theorem, which states that, under the conditions assumed in the
previous paragraph, as long as the distribution of the observations has
a finite moment of order slightly higher than 2 (sometimes called the 2 + δ moment),*
√n(X̄ − μ)/S will approach the standard normal distribution as n gets
large. Figure 4.1 illustrates this.
In the figure we see observations drawn from distributions with a variety
of shapes: the distributions of the sample means differ markedly from one another
when n = 2, differ less when n = 5, and are
all very close to the shape of a normal distribution when n = 30.
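This behavior can be seen in a small simulation. The sketch below is an illustration under assumed conditions, not the book's Figure 4.1: it draws Exponential(1) observations (a skewed distribution with mean 1 and all moments finite), studentizes the sample mean, and estimates how often |√n(X̄ − μ)/S| falls within ±1.96, which should approach 0.95 as n grows:

```python
import math
import random

def studentized(sample, mu):
    """sqrt(n) * (xbar - mu) / S with S the sample SD (n - 1 denominator)."""
    n = len(sample)
    xbar = sum(sample) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in sample) / (n - 1))
    return math.sqrt(n) * (xbar - mu) / s

def coverage(n, reps=20000):
    """Fraction of replications with |statistic| <= 1.96, the normal 95% cutoff."""
    hits = 0
    for _ in range(reps):
        sample = [random.expovariate(1.0) for _ in range(n)]
        hits += abs(studentized(sample, 1.0)) <= 1.96
    return hits / reps

random.seed(42)
for n in (2, 5, 30):
    print("n =", n, "coverage:", round(coverage(n), 3))
```

For skewed observations the coverage is far from 0.95 at n = 2, improves at n = 5, and is close to the nominal level by n = 30, in line with the central limit theorem.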



  * Recall that the population mean is E(X). This is called the first moment. E(X²) is called
    the second moment. The variance is E[X − E(X)]² = E(X²) − [E(X)]², and is called the second
    central moment. The 2 + δ moment is then E(X^(2+δ)) with δ > 0.
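The moment identities in the footnote can be verified exactly for a simple discrete distribution. The fair-die example below is an illustration, not from the text; exact rational arithmetic confirms that Var(X) = E(X²) − [E(X)]²:

```python
from fractions import Fraction

# X is the outcome of a fair six-sided die, each face with probability 1/6.
vals = [Fraction(k) for k in range(1, 7)]
EX = sum(vals) / 6                      # first moment E(X)
EX2 = sum(v * v for v in vals) / 6      # second moment E(X^2)
var = EX2 - EX ** 2                     # second central moment
print(EX, EX2, var)  # 7/2 91/6 35/12
```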
