Introductory Biostatistics


In this chapter we deal with the first category and the statistical procedure called
estimation. It is extremely useful, one of the most useful procedures of statistics.
The word estimate actually has a language problem, the opposite of the
language problem of statistical “tests” (the topic of Chapter 5). The colloquial
meaning of the word test carries the implication that statistical tests are
especially objective, no-nonsense procedures that reveal the truth. Conversely,
the colloquial meaning of the word estimate is that of guessing, perhaps off the
top of the head and uninformed, not to be taken too seriously. It is used by car
body repair shops, which “estimate” how much it will cost to fix a car after an
accident. The estimate in that case is actually a bid from a for-profit business
establishment seeking your trade. In our case, the word estimation is used in the
usual sense that provides a “substitute” for an unknown truth, but it isn’t that
bad a choice of word once you understand how to do it. But it is important to
make it clear that statistical estimation is no less objective than any other
formal statistical procedure; statistical estimation requires calculations and
tables just as statistical testing does. In addition, it is very important to
differentiate formal statistical estimation from ordinary guessing. In formal
statistical estimation, we can determine the amount of uncertainty (and so the
error) in the estimate. How often have you heard of someone making a guess and
then giving you a number measuring the “margin of error” of the guess? That’s
what statistical estimation does. It gives you the best guess and then tells you
how “wrong” the guess could be, in quite precise terms. Certain media,
sophisticated newspapers in particular, are starting to educate the public about
statistical estimation. They do it when they report the results of polls. They say
things like, “74% of the voters disagree with the governor’s budget proposal,”
and then go on to say that the margin of error is plus or minus 3%. What they are
saying is that whoever conducted the poll is claiming to have polled about 1000
people chosen at random and that statistical estimation theory tells us to be
95% certain that if all the voters were polled, their disagreement percentage
would be discovered to be within 3% of 74%. In other words, it’s very unlikely
that the 74% is off the mark by more than 3%; the truth is almost certainly
between 71 and 77%. In subsequent sections of this chapter we introduce the
strict interpretation of these confidence intervals.
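
The poll arithmetic above can be checked with a short calculation. The sketch
below is written in Python and assumes the usual large-sample
(normal-approximation) formula for the margin of error of a proportion,
z * sqrt(p(1 - p)/n), with z = 1.96 for 95% confidence. Neither the formula nor
the function names margin_of_error and proportion_interval come from this
passage, and the sample size is taken as exactly 1000 although the text says
only “about 1000.”

```python
import math


def margin_of_error(p_hat: float, n: int, z: float = 1.96) -> float:
    """Half-width of the normal-approximation interval for a proportion.

    Assumption: simple random sample, n large enough for the normal
    approximation; z = 1.96 corresponds to 95% confidence.
    """
    return z * math.sqrt(p_hat * (1.0 - p_hat) / n)


def proportion_interval(p_hat: float, n: int, z: float = 1.96) -> tuple[float, float]:
    """Point estimate plus or minus the margin of error, clipped to [0, 1]."""
    moe = margin_of_error(p_hat, n, z)
    return max(0.0, p_hat - moe), min(1.0, p_hat + moe)


# Poll figures quoted in the text; n = 1000 is an assumed exact value.
p_hat, n = 0.74, 1000

moe = margin_of_error(p_hat, n)
low, high = proportion_interval(p_hat, n)

print(f"margin of error at p = 0.74: {moe:.3f}")
print(f"95% interval: {low:.3f} to {high:.3f}")

# Worst case: p(1 - p) is largest at p = 0.5, which pollsters often quote.
print(f"margin of error at p = 0.50: {margin_of_error(0.5, n):.3f}")
```

Under these assumptions the margin comes out a little under 3 percentage points
at p = 0.74 and close to 3.1 points at p = 0.5, the value that maximizes
p(1 - p). The reported “plus or minus 3%” is consistent with either figure after
rounding, so the range of roughly 71 to 77% given in the text holds up.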


4.1 BASIC CONCEPTS


A class of measurements or a characteristic on which individual observations
or measurements are made is called a variable or random variable. The value of
a random variable varies from subject to subject; examples include weight,
height, blood pressure, or the presence or absence of a certain habit or practice,
such as smoking or use of drugs. The distribution of a random variable is often
assumed to belong to a certain family of distributions, such as binomial,
Poisson, or normal. This assumed family of distributions is specified or indexed
by one or several parameters, such as a population mean μ or a population pro-


ESTIMATION OF PARAMETERS
