Encyclopedia of Environmental Science and Engineering, Volume I and II

1126 STATISTICAL METHODS FOR ENVIRONMENTAL SCIENCE

importance in environmental work. Others are encountered occasionally, such as the exponential distribution, which has been used to compute probabilities in connection with the expected failure rate of equipment. The distribution of times between occurrences of events in a Poisson process is described by the exponential distribution, which is important in the theory of such stochastic processes (Parzen, 1962). Further discussion of continuous distributions may be found in Freund (1962) or most other standard statistical texts.

A special distribution problem often encountered in environmental work concerns the occurrence of extreme values of variables described by any one of several distributions. For example, in forecasting floods in connection with planning of construction, or droughts in connection with such problems as stream pollution, the concern is with the most extreme values to be expected. To deal with such problems, the asymptotic theory of extreme values of a statistical variable has been developed. Special tables have been developed for estimating the expected extreme values for several distributions whose extremes are unlimited in the range of values they can take. Some information is also available for distributions with restricted ranges. An interesting application of this theory to prediction of the occurrence of unusually high tides may be found in Pfafflin (1970) and the Delta Commission Report (1960). Further discussion may be found in Gumbel.
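The two ideas above can be illustrated with a minimal Python sketch. It rests on assumptions that are not in the text: the function names and all numerical values are hypothetical, and the extreme-value part assumes, purely for illustration, that the maxima follow a Gumbel (type I extreme value) distribution, which is only one of the asymptotic forms and is not necessarily the one used in the works cited.

```python
import math


def prob_no_event_before(rate, t):
    """P(T > t) when the waiting time T between Poisson events is exponential.

    rate is the mean number of events per unit time (hypothetical input).
    """
    if rate <= 0 or t < 0:
        raise ValueError("rate must be positive and t non-negative")
    return math.exp(-rate * t)


def gumbel_cdf(x, location, scale):
    """Cumulative distribution function of the Gumbel distribution for maxima."""
    return math.exp(-math.exp(-(x - location) / scale))


def gumbel_return_level(location, scale, return_period):
    """Level whose non-exceedance probability is 1 - 1/return_period.

    Assumes the maximum per period (for example, per year) is Gumbel distributed.
    """
    if scale <= 0 or return_period <= 1:
        raise ValueError("scale must be positive and return_period greater than 1")
    p = 1.0 - 1.0 / return_period
    return location - scale * math.log(-math.log(p))


# Hypothetical equipment failure rate: 0.2 failures per year.
survival = prob_no_event_before(rate=0.2, t=5.0)

# Hypothetical Gumbel parameters for annual maximum tide height, in metres.
level = gumbel_return_level(location=3.0, scale=0.5, return_period=100)

print(f"P(no failure within 5 years) = {survival:.4f}")
print(f"Estimated 100-year level = {level:.3f} m")
```

In practice the location and scale parameters would have to be estimated from observed maxima, and the Gumbel assumption itself should be checked against the data before any such level is used for planning.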

HYPOTHESIS TESTING

Sampling Considerations

A basic consideration in the application of statistical procedures is the selection of the data. In parameter estimation and hypothesis testing, sample data are used to make inferences about some larger population. The data are assumed to be a random sample from this population. By random we mean that the sample has been selected in such a way that the probability of obtaining any particular sample value is the same as its probability in the sampled population. When the data are collected, care must be taken to ensure that they are a random sample from the population of interest and that there are no biases in the selection process which would make the samples unrepresentative. Otherwise, valid inferences cannot be made from the sample to the sampled population. The procedures necessary to ensure that these conditions are met will depend in part upon the particular problem being studied.

A basic principle which applies in all experimental work, however, is that of randomization. Randomization means that the sample is taken in such a way that any uncontrolled variables which might affect the results have an equal chance of affecting any of the samples. For example, in agricultural studies when plots of land are being selected, the assignment of different experimental conditions to the plots of land should be done randomly, by the use of a table of random numbers or some other randomizing process. Thus, any differences which arise between the sample values as a result of differences in soil conditions will have an equal chance of affecting each of the samples.

Randomization avoids error due to bias, but it does nothing about uncontrolled variability. Variability can be reduced by holding constant other parameters which may affect the experimental results. In a study comparing the smog-producing effects of natural and artificial light, other variables, such as temperature, chamber dilution, and so on, were held constant (Laity, 1971). Note, however, that such control also restricts generalization of the results to the conditions used in the test.

Special sampling techniques may be used in some cases to reduce variability. For example, suppose that in an agricultural experiment, plots of land must be chosen from three different fields. These fields may then be incorporated explicitly into the design of the experiment and used as control variables. Comparisons of interest would be arranged so that they can be made within each field, if possible. It should be noted that the use of control variables is not a departure from randomization. Randomization should still be used in assigning conditions within levels of a control variable. Randomization is necessary to prevent bias from variables which are not explicitly controlled in the design of the experiment.

Considerations of random sampling and the selection of appropriate control variables, to increase the precision of the experiment and ensure a more accurate sample selection, arise in all areas that use statistical methods. They are particularly important in certain environmental areas, however. In human population studies great care must be taken in the sampling procedures to ensure representativeness of the samples. Simple random sampling techniques are seldom adequate, and more complex procedures have been developed. For further discussion of this kind of sampling, see Kish (1965) and Yates (1965).
Sampling problems which may affect the generality of the results arise in connection with inferences from cloud seeding experiments (Bernier, 1967). Since most environmental experiments involve variables which are affected by a wide variety of other variables, sampling problems, especially the question of generalization from experimental results, are very common. The specific randomization procedures, control variables, and limitations on generalization of results will depend upon the particular field in question, but any experiment in this area should be designed with these problems in mind.
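The agricultural example discussed in this section, in which fields serve as a control variable and conditions are still assigned at random within each field, can be sketched in Python as follows. This is an illustration under stated assumptions, not a prescribed procedure: the function name, the field and plot labels, and the two conditions are all hypothetical, and a pseudorandom number generator stands in for the table of random numbers.

```python
import random


def assign_within_blocks(blocks, conditions, seed=None):
    """Randomly assign conditions to plots, separately within each block.

    blocks maps a block name (here, a field) to its list of plots.
    Each condition is used equally often within every block, so that
    comparisons between conditions can be made within each field.
    """
    rng = random.Random(seed)  # seeded so that the layout can be reproduced
    assignment = {}
    for block, plots in blocks.items():
        if len(plots) % len(conditions) != 0:
            raise ValueError(f"plots in {block} cannot be split evenly among conditions")
        replicates = len(plots) // len(conditions)
        labels = list(conditions) * replicates
        rng.shuffle(labels)  # the randomization step, done anew for each block
        for plot, label in zip(plots, labels):
            assignment[(block, plot)] = label
    return assignment


# Hypothetical layout: three fields with four plots each.
fields = {
    "field_A": ["plot_1", "plot_2", "plot_3", "plot_4"],
    "field_B": ["plot_1", "plot_2", "plot_3", "plot_4"],
    "field_C": ["plot_1", "plot_2", "plot_3", "plot_4"],
}

layout = assign_within_blocks(fields, ["control", "treated"], seed=42)
for (field, plot), condition in sorted(layout.items()):
    print(field, plot, condition)
```

Shuffling separately inside each field keeps every condition represented equally in every field, while soil differences within a field still have an equal chance of affecting either condition.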

Parameter Estimation

A common problem encountered in environmental work is the estimation of population parameters from sample values. Examples of such estimation questions are: What is the “best” estimate of the mean of a population? Within what range of values can the mean safely be assumed to lie? In order to answer such questions, we must decide what is meant by a “best” estimate. Probably the most widely used method of estimation is that of maximum likelihood, developed by Fisher (1958). A maximum likelihood estimate is one which selects that parameter value for a distribution describing
