In situations where a two-way categorization of the data
exists, the expected values may be estimated from the mar-
ginals. For example, the formula for chi-square for the four-
fold contingency table shown below is

                     Classification II
  Classification I   A   B
                     C   D

\[
\chi^2 = \frac{N\,(AD - BC)^2}{(A+B)(C+D)(A+C)(B+D)} .
\tag{17}
\]
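As a numerical illustration of equation (17), the statistic can be computed directly from the four cell counts; the sketch below uses invented counts and, as an assumption not drawn from the text, cross-checks the result against SciPy's general contingency-table routine:

```python
from scipy.stats import chi2_contingency

# Hypothetical cell counts for the fourfold table above.
A, B, C, D = 30, 10, 20, 40
N = A + B + C + D

# Equation (17), computed directly from the cell counts and marginals.
chi_sq = N * (A * D - B * C) ** 2 / ((A + B) * (C + D) * (A + C) * (B + D))

# Cross-check: SciPy estimates the expected values from the marginals;
# correction=False disables Yates' continuity correction so the result
# matches the raw formula. Note that it reports one degree of freedom.
stat, p, dof, expected = chi2_contingency([[A, B], [C, D]], correction=False)
print(chi_sq, stat, dof)  # the two statistics agree; dof == 1
```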

Observe that instead of having independent expected values,
we are now estimating these parameters from the marginal
distributions of the data. The result is a loss in the degrees
of freedom for the estimate. A chi-square with four indepen-
dently obtained expected values would have four degrees of
freedom; the fourfold table above has only one. The con-
cept of degrees of freedom is a very general one in statistical
analysis. It is related to the number of observations which can
vary independently of each other. When expected values for
chi-square are computed from the marginals, not all of the
O − E differences in a row or column are independent, for their
discrepancies must sum to zero. Calculation of means from
sample data imposes a similar restriction; since the deviations
from the mean must sum to zero, not all of the observations in
the sample can be regarded as freely varying. It is important to
have the correct number of degrees of freedom for an estimate
in order to determine the proper level of significance; many
statistical tables require this information explicitly, and it is
implicit in any comparison. Calculation of the proper degrees
of freedom for a comparison can become complicated in spe-
cific cases, especially that of analysis of variance. The basic
principle to remember, however, is that any linearly independent
constraints placed on the data will reduce the degrees of freedom.
Tables of values of the \(\chi^2\) distribution for various degrees
of freedom are readily available. For a further discussion of
the use of chi-square, see Snedecor.
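Such tables are also easy to reproduce programmatically; a minimal sketch, assuming SciPy is available:

```python
from scipy.stats import chi2

# Upper 5% points of the chi-square distribution: the value a computed
# chi-square must exceed to be significant at the 0.05 level.
for df in (1, 2, 3, 4):
    print(df, round(chi2.ppf(0.95, df), 3))
# df = 1: 3.841,  df = 2: 5.991,  df = 3: 7.815,  df = 4: 9.488
```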

Difference between Two Samples

Another common situation arises when two samples are
taken, and the experimenter wishes to know whether or not
they are samples from populations with the same parameter
values. If the populations can be presumed to be normal,
then the significance of the differences of the two means can
be tested by

\[
t = \frac{\hat{m}_1 - \hat{m}_2}{\sqrt{\dfrac{s_1^2}{N_1} + \dfrac{s_2^2}{N_2}}}
\tag{18}
\]

where \(\hat{m}_1\) and \(\hat{m}_2\) are the sample means, \(s_1^2\) and \(s_2^2\) are the
sample variances, and \(N_1\) and \(N_2\) are the sample sizes; the
population variances are assumed to be equal. This is the
t-test for two samples. The t-test can also be used to test the
significance of the difference between one sample mean and
a theoretical value. Tables for the significance of the t-test
may be found in most statistical texts.
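As a concrete example, equation (18) can be applied to two small samples; the data below are invented, and the cross-check against SciPy's ttest_ind is an assumption added for illustration:

```python
import math
from scipy import stats

# Hypothetical measurements from two samples.
x1 = [4.2, 5.1, 4.8, 5.6, 4.9, 5.3]
x2 = [5.8, 6.1, 5.5, 6.4, 5.9, 6.2]

m1, m2 = sum(x1) / len(x1), sum(x2) / len(x2)           # sample means
s1_sq = sum((x - m1) ** 2 for x in x1) / (len(x1) - 1)  # sample variances
s2_sq = sum((x - m2) ** 2 for x in x2) / (len(x2) - 1)

# Equation (18)
t = (m1 - m2) / math.sqrt(s1_sq / len(x1) + s2_sq / len(x2))

# SciPy's two-sample t-test assumes equal population variances by
# default, matching the text; with equal sample sizes its pooled
# statistic coincides with the formula above.
t_scipy, p = stats.ttest_ind(x1, x2)
print(t, t_scipy, p)
```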
The theory underlying the t-test is that the measures of
dispersion estimated from the observations within a sample
provide estimates of the expected variability. If the means are
close together, relative to that variability, then it is unlikely
that the populations differ in their true values. However, if
the means vary widely, then it is unlikely that the samples
come from populations with the same underlying distribution.
This situation is diagrammed in Figure 6. The t-test
permits an exact statement of how unlikely the null hypoth-
esis (assumption of no difference) is. If it is sufficiently
unlikely, it can be rejected. It is common to retain the null
hypothesis unless it can be rejected at the 95% level of
confidence, though more stringent criteria (99% or more) may be
adopted if more certainty is needed.
The more stringent the criterion, of course, the more likely
it is that the null hypothesis will be accepted when, in fact, it
is false. The probability of falsely rejecting the null hypoth-
esis is known as a type I error. Accepting the null hypothesis
when it should be rejected is known as a type II error. For a
given type I error, the probability of correctly rejecting the
null hypothesis for a given true difference is known as the
power of the test for detecting the difference. The function of
these probabilities for various true differences in the param-
eter under test is known as the power function of the test.
Statistical tests differ in their power and power functions are
useful in the comparison of different tests.
Note that type I and type II errors are necessarily related;
for an experiment of a given level of precision, decreasing
the probability of a type I error raises the probability of a
type II error, and vice versa. Thus, increasing the stringency
of one’s criterion does not decrease the overall probability
of an erroneous conclusion; it merely changes the type of
error which is most likely to be made. To decrease the over-
all error, the experiment must be made more precise, either
by increasing the number of observations, or by reducing the
error in the individual observations.
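This trade-off can be made concrete by computing the power of the two-sample t-test under two criteria; the sketch below uses the noncentral t distribution, with an invented effect size and sample size:

```python
from scipy.stats import t as t_dist, nct

def power(delta, N, alpha):
    """Power of a two-sided, two-sample t-test with N observations per
    group, for a true mean difference delta (in units of the common
    population standard deviation)."""
    df = 2 * N - 2                    # pooled degrees of freedom
    nc = delta * (N / 2) ** 0.5       # noncentrality parameter
    t_crit = t_dist.ppf(1 - alpha / 2, df)
    # P(reject H0) when the true difference is delta
    return 1 - nct.cdf(t_crit, df, nc) + nct.cdf(-t_crit, df, nc)

for alpha in (0.05, 0.01):            # 95% vs. 99% criterion
    print(alpha, round(power(delta=0.8, N=20, alpha=alpha), 3))
# Tightening the criterion (smaller alpha, fewer type I errors)
# lowers the power, i.e. raises the probability of a type II error.
```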
Many other tests of mean difference exist besides
the t-test. The appropriate choice of a test will depend on
the assumptions made about the distribution underlying the
observations. In theory, the t-test applies only to variables
which are continuous, range from −∞ to +∞ in value, and
are normally distributed.
[FIGURE 6: f(X) plotted against X (in σ units) for distributions with means m₁, m₂, and m₃]
