Encyclopedia of Environmental Science and Engineering, Volume I and II


STATISTICAL METHODS FOR ENVIRONMENTAL SCIENCE 1129


are normally distributed with equal variance assumed for the
underlying population. In practice, it is often applied to vari-
ables of a more restricted range, and in some cases where the
observed values of a variable are inherently discontinuous.
However, when the assumptions of the test are violated, or
distribution information is unavailable, it may be safer to use
nonparametric tests, which do not depend on assumptions
about the shape of the underlying distribution. Nonparametric
tests are less powerful than parametric tests such as the t-test
when the assumptions of the parametric tests are met, and are
therefore less likely to reject the null hypothesis; in practice,
however, they yield results close to those of the t-test unless
its assumptions are seriously violated. Nonparametric tests
have been used in meteorological studies because of
nonnormality in the distribution of rainfall samples (Decker
and Schickedanz, 1967). For further discussions of hypothesis
testing, see Hoel (1962) and Lehmann
(1959). Discussions of nonparametric tests may be found in
Pierce (1970) and Siegel (1956).
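The contrast between the two families of tests can be sketched numerically. The following is an illustrative comparison only, with invented lognormal samples standing in for right-skewed rainfall data; it assumes SciPy's `ttest_ind` and `mannwhitneyu` routines.

```python
# Sketch: a parametric t-test versus its nonparametric counterpart
# (the Mann-Whitney U test) on skewed, rainfall-like data.
# All sample values are invented for illustration.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Lognormal samples mimic the right-skewed shape typical of rainfall.
site_a = rng.lognormal(mean=1.0, sigma=0.8, size=30)
site_b = rng.lognormal(mean=1.4, sigma=0.8, size=30)

t_stat, t_p = stats.ttest_ind(site_a, site_b)
u_stat, u_p = stats.mannwhitneyu(site_a, site_b)

print(f"t-test:       t = {t_stat:.3f}, p = {t_p:.4f}")
print(f"Mann-Whitney: U = {u_stat:.1f}, p = {u_p:.4f}")
```

With well-behaved data the two tests usually agree on the decision; they diverge mainly when the t-test's normality assumption is badly violated.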

Analysis of Variance (ANOVA)

The t-test applies to the comparison of two means. The con-
cepts underlying the t-test may be generalized to the testing of
more than two means. The result is known as the analysis of
variance. Suppose that one has several samples. A number
of variances may be estimated. The variance of each sample
can be computed around the mean for the sample. The vari-
ance of the sample means around the grand mean of all the
scores gives another variance. Finally, one can ignore the
grouping of the data and compute the variance for all scores
around the grand mean. It can be shown that this “total” vari-
ance can be regarded as made up of two independent parts,
the variance of the scores about their sample means, and the
variance of these means about the grand mean. If all these
samples are indeed from the same population, then estimates
of the population variance obtained from within the individ-
ual groups will be approximately the same as that estimated
from the variance of sample means around the grand mean.
If, however, they come from populations which are normally
distributed and have the same standard deviations but
different means, then the variance estimated from the sample
means will exceed that estimated from within the samples.
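The partition of the total variation into these two independent parts can be verified numerically. A minimal sketch with invented data, checking that the total sum of squares equals the within-group plus between-group sums of squares:

```python
# Sketch: total sum of squares = within-group SS + between-group SS.
# The three small samples below are invented for illustration.
import numpy as np

groups = [np.array([4.0, 5.0, 6.0]),
          np.array([7.0, 9.0, 8.0]),
          np.array([5.0, 6.0, 7.0])]

all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()

# Variation of each score about its own sample mean.
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
# Variation of the sample means about the grand mean.
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
# Variation of all scores about the grand mean, ignoring the grouping.
ss_total = ((all_scores - grand_mean) ** 2).sum()

assert np.isclose(ss_total, ss_within + ss_between)
```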
The formal test of the hypothesis is known as the F-test.
It is made by forming the F-ratio.

F = MSE1/MSE2 (19)

Mean square estimates (MSE) are obtained from variance
estimates by division by the appropriate degrees of free-
dom. The mean square estimate in the numerator is that for
the hypothesis to be tested. The mean square estimate in
the denominator is the error estimate; it derives from some
source which is presumed to be affected by all sources of
variance which affect the numerator, except those arising

from the hypothesis under test. The two estimates must also
be independent of each other. In the example above, the
within group MSE is used as the error estimate; however,
this is often not the case for more complex experimental
designs. The appropriate error estimate must be determined
from examination of the particular experimental design, and
from considerations about the nature of the independent
variables whose effect is being tested; independent variables
whose values are fixed may require different error estimates
than in the case of independent variables whose values are
to be regarded as samples from a larger set. Determination
of degrees of freedom for analysis of variance goes beyond
the scope of this paper, but the basic principle is the same
as previously discussed; each parameter estimated from the
data (usually a mean, for ANOVA) in computing an estimator
reduces the degrees of freedom for that estimate.
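For the one-way design discussed above, the mean squares and the F-ratio of Eq. (19) can be computed directly, with the between-group estimate on k − 1 degrees of freedom and the within-group (error) estimate on n − k. The sketch below uses invented data and checks the hand computation against SciPy's `f_oneway`:

```python
# Sketch: forming the F-ratio of Eq. (19) as MS_between / MS_within
# for a one-way design, checked against scipy.stats.f_oneway.
# Sample data are invented.
import numpy as np
from scipy import stats

groups = [np.array([4.0, 5.0, 6.0]),
          np.array([7.0, 9.0, 8.0]),
          np.array([5.0, 6.0, 7.0])]

k = len(groups)                   # number of groups
n = sum(len(g) for g in groups)   # total number of observations
grand_mean = np.concatenate(groups).mean()

ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)

ms_between = ss_between / (k - 1)   # hypothesis mean square, k - 1 df
ms_within = ss_within / (n - k)     # error mean square, n - k df
f_ratio = ms_between / ms_within

f_scipy, p_value = stats.f_oneway(*groups)
assert np.isclose(f_ratio, f_scipy)
print(f"F = {f_ratio:.3f}, p = {p_value:.4f}")
```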
The linear model for such an experiment is given by

Xij = μ + Gi + eij, (20)

where Xij is a particular observation, μ is the mean, Gi is
the effect of the ith experimental condition, and eij is the
error uniquely associated with that observation. The eij are
assumed to be independent random samples from normal
distributions with zero mean and the same variances. The
analysis of variance thus tests whether various components
making up a score are significantly different from zero.
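The model of Eq. (20) is easy to simulate, which makes its components concrete. A sketch with invented values for μ, the Gi (chosen to sum to zero), and the common error standard deviation:

```python
# Sketch: generating data under the linear model of Eq. (20),
# X_ij = mu + G_i + e_ij. Effect sizes and sigma are invented.
import numpy as np

rng = np.random.default_rng(1)
mu = 10.0
effects = [-1.0, 0.0, 1.0]   # G_i for three conditions, summing to zero
sigma = 0.5                  # common error standard deviation

data = {i: mu + g + rng.normal(0.0, sigma, size=20)
        for i, g in enumerate(effects)}

# Each sample mean should track mu + G_i.
for i, g in enumerate(effects):
    print(f"group {i}: mean = {data[i].mean():.2f} (model: {mu + g:.2f})")
```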
More complicated components may be presumed. For
example, in the case of a two-way table, the assumed model
might be

Xijk = μ + Ri + Cj + RCij + eijk, (21)

In addition to a second main effect, there is a term RCij
which is associated with that particular combination of levels
of the main effects. Such effects are known
as interaction effects.
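The interaction term of Eq. (21) is what remains of a cell mean after the grand mean and both main effects are removed. A sketch with an invented 2 × 2 table of cell means:

```python
# Sketch: estimating the interaction terms RC_ij of Eq. (21) from the
# cell means of a two-way table. Table values are invented.
import numpy as np

# Rows = levels of factor R, columns = levels of factor C.
cell_means = np.array([[10.0, 14.0],
                       [12.0, 20.0]])

grand = cell_means.mean()
row_eff = cell_means.mean(axis=1) - grand   # R_i estimates
col_eff = cell_means.mean(axis=0) - grand   # C_j estimates

# Interaction: what is left after removing grand mean and main effects.
interaction = (cell_means - grand
               - row_eff[:, None] - col_eff[None, :])
print(interaction)
```

A table with no interaction would leave this residual matrix at zero; here the nonzero pattern shows the effect of one factor depends on the level of the other.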
Basic assumptions of the analysis of variance are nor-
mality and homogeneity of variance. The F-test, however,
has been shown to be relatively “robust” as far as deviations
from the strict assumption of normality go. Violations of the
assumption of homogeneity of variance may be more seri-
ous. Tests have been developed which can be applied where
violations of this assumption are suspected. See Scheffé
(1959; ch.10) for further discussion of this problem.
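One such check is Levene's test, available as `scipy.stats.levene`; it is only one of several procedures for this purpose, and the samples below are invented. A small null hypothesis of equal variances is tested for two sets of groups:

```python
# Sketch: screening the homogeneity-of-variance assumption with
# Levene's test before running an ANOVA. Sample data are invented.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
equal_var = [rng.normal(0.0, 1.0, 30) for _ in range(3)]
unequal_var = [rng.normal(0.0, s, 30) for s in (1.0, 3.0, 9.0)]

_, p_equal = stats.levene(*equal_var)
_, p_unequal = stats.levene(*unequal_var)

print(f"equal variances:   p = {p_equal:.3f}")
print(f"unequal variances: p = {p_unequal:.2e}")
```

A small p-value in the second case signals that the homogeneity assumption is suspect and the ordinary F-test may be unreliable.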
Innumerable variations on the basic models are possible.
For a more detailed discussion, see Cochran and Cox (1957) or
Scheffé (1959). It should be noted, especially, that a significant
F-ratio does not assure that all the conditions which entered
into the comparison differ significantly from each other. To
determine which mean differences are significantly differ-
ent, additional tests must be made. The problem of multiple
comparisons among several means has been approached in
three main ways: Scheffé’s method for post-hoc comparisons;
Tukey’s gap test; and Duncan’s multiple range test. For further
discussion of such testing, see Kirk (1968).
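As one illustration of such post-hoc testing, SciPy (1.8 and later) provides a Tukey HSD procedure as `scipy.stats.tukey_hsd`; the sketch below uses invented samples in which one group mean clearly differs from the other two:

```python
# Sketch: post-hoc pairwise comparisons after a significant F-ratio,
# using the Tukey HSD procedure (scipy.stats.tukey_hsd, SciPy 1.8+).
# Sample data are invented.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
a = rng.normal(10.0, 1.0, 25)
b = rng.normal(10.2, 1.0, 25)   # close to a
c = rng.normal(13.0, 1.0, 25)   # clearly larger

result = stats.tukey_hsd(a, b, c)
print(result)   # pairwise confidence intervals and p-values
```

A significant overall F-ratio followed by this table shows which specific pairs of means account for the difference.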
Computational formulas for ANOVA can be found in
standard texts covering this topic. However, hand calculation
