Basic Statistics

(Barry) #1

Since the mean of all the $\bar{X}$'s is equal to $\mu$, the sample mean $\bar{X}$ is called an
unbiased statistic for the parameter $\mu$. This is simply a way of saying that $\mu_{\bar{X}} = \mu$.
In a research situation we do not know the values of all the members of the population
and we take just one sample. Also, we do not know whether our sample mean
is greater than or less than the population mean, but we do know that if we always
sample with the same sample size, in the long run the mean of our sample means will
equal the population mean.
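
The long-run claim can be checked by brute force on a small case. The sketch below is illustrative only: the four population values are hypothetical (not the data used in the text), and it assumes samples of size 2 drawn with replacement, which gives 16 possible ordered samples.

```python
from itertools import product
from statistics import mean

# Hypothetical population of four values (an assumption for illustration).
population = [2, 4, 6, 10]
n = 2  # sample size

# Every ordered sample of size n drawn with replacement: 4**2 = 16 samples.
samples = list(product(population, repeat=n))

# One sample mean per possible sample.
sample_means = [mean(s) for s in samples]

population_mean = mean(population)
mean_of_sample_means = mean(sample_means)

print("number of samples:   ", len(samples))
print("population mean:     ", population_mean)
print("mean of sample means:", mean_of_sample_means)
```

Individual sample means range from 2 to 10 here, yet their average over all possible samples matches the population mean, which is what unbiasedness asserts.
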
By examining the mean of all 16 variances, we can see that the mean of all the
sample variances equals the population variance. We can say that the sample variance
$s^2$ is an unbiased estimate of the population variance, $\sigma^2$. This is the third general
principle.
Our estimate of the population variance may be too high or too low or close to
correct, but in repeated sampling, if we keep taking random samples, the estimates
average to $\sigma^2$. The reason that $n - 1$ is used instead of $n$ in the denominator of $s^2$
is that it is desirable to have an unbiased statistic. If we had used $n$, we would have
a biased estimate. The mean of $s^2$ from all the samples would be smaller than the
population variance.
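
The effect of the divisor can be seen by enumerating every possible sample. This is a minimal sketch under stated assumptions: the population values are hypothetical, samples of size 2 are drawn with replacement, and the population variance is computed with divisor N (the population size). In Python's standard library, `statistics.variance` divides by n - 1 and `statistics.pvariance` divides by n.

```python
from itertools import product
from statistics import mean, pvariance, variance

# Hypothetical population of four values (an assumption for illustration).
population = [2, 4, 6, 10]
n = 2  # sample size

# All 16 ordered samples of size 2 drawn with replacement.
samples = list(product(population, repeat=n))

# Population variance (divisor N).
sigma2 = pvariance(population)

# Sample variance with divisor n - 1 for every possible sample.
s2_divisor_n_minus_1 = [variance(s) for s in samples]

# The same sum of squares divided by n instead.
s2_divisor_n = [pvariance(s) for s in samples]

mean_unbiased = mean(s2_divisor_n_minus_1)
mean_biased = mean(s2_divisor_n)

print("population variance:          ", sigma2)
print("mean of s^2 with divisor n - 1:", mean_unbiased)
print("mean of s^2 with divisor n:    ", mean_biased)
```

With these values the divisor n - 1 reproduces the population variance on average, while the divisor n averages to half of it, since (n - 1)/n = 1/2 when n = 2.
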


5.4 CONSIDERATIONS IN SELECTING APPROPRIATE STATISTICS


Several statistics that can be used as measures of location and variability have been
presented in this chapter. What statistic should be used? In Section 5.4.1 we consider
such factors as the study objectives and the ease of comparison with other studies. In
Section 5.4.2 we consider the quality of the data. In Section 5.4.3 the importance of
matching the type of data to the statistics used is discussed in the context of Stevens’
system of measurements.


5.4.1 Relating Statistics and Study Objectives


The primary consideration in choosing statistics is that they must fit the major objectives
of the study. For example, if an investigator wishes to make statements about
the average yearly expenditures for medical care, the mean expenditure should be
computed. But if the hope is to decide how much a typical family spends, medians
should be considered.
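
A small numerical illustration may help separate the two objectives. The expenditure figures below are invented for the example; they assume seven families, one of which has a very large medical bill.

```python
from statistics import mean, median

# Hypothetical yearly medical expenditures in dollars for seven families.
# One family has a very large bill, as often happens with cost data.
expenditures = [300, 450, 500, 600, 750, 900, 25000]

average = mean(expenditures)    # pulled upward by the single large bill
typical = median(expenditures)  # middle value, unaffected by the extreme

# The mean times the number of families recovers the total spent.
total = average * len(expenditures)

print(f"mean expenditure:   {average:.2f}")
print(f"median expenditure: {typical}")
print(f"total expenditure:  {total:.2f}")
```

In this made-up data set the mean is several times the median. The mean is the right summary when the goal concerns average or total spending, while the median better describes what a family in the middle of the distribution pays.
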
Investigators should take into account the statistics used by other investigators.
If articles and books on a topic such as blood pressure include means and standard
deviations, serious consideration should be given to using these statistics in order
to simplify comparisons for the reader. Similarly, in grouping the data into class
intervals, if 5-mmHg intervals have been used by others, it will be easier to compare
histograms if the same intervals are used. One need not slavishly follow what others
have done, but their choices deserve consideration if the results are to be
compared.
The choice of statistics should be consistent internally in a report or article. For
example, most investigators who report means also report standard deviations. Those
who choose medians are also more apt to report quartiles.
