Introductory Biostatistics

(Chris Devlin) #1

as a metaphor. In criminal court, the accused is “presumed innocent” until
“proved guilty beyond all reasonable doubt.” This framework of presumed
innocence has nothing whatsoever to do with anyone’s personal belief as to the
innocence or guilt of the defendant. Sometimes everybody, including the jury,
the judge, and even the defendant’s attorney, thinks the defendant is guilty. The
rules and procedures of the criminal court must be followed, however. There
may be a mistrial or a hung jury, or the arresting officer may have forgotten to read the
defendant his or her rights. Any number of things can happen to save the guilty
from a conviction. On the other hand, an innocent defendant is sometimes
convicted by overwhelming circumstantial evidence. Criminal courts occasion-
ally make mistakes, sometimes releasing the guilty and sometimes convicting
the innocent. Statistical tests are like that. Sometimes, statistical significance is
attained when nothing is going on, and sometimes, no statistical significance
is attained when something very important is going on.
Just as in the courtroom, everyone would like statistical tests to make mis-
takes as infrequently as possible. Actually, the mistake rate of one of two pos-
sible mistakes made by statistical tests has usually been chosen (arbitrarily) to
be 5% or 1%. The kind of mistake referred to here is the mistake of attaining
statistical significance when there is actually nothing going on, much like the
mistake of convicting the innocent in a trial by jury. This mistake is called a type I
mistake or type I error. Statistical tests are often constructed so that type I
errors occur 5% or 1% of the time. There is no custom regarding the rate of
type II errors, however. A type II error is the mistake of not getting statistical
significance when there is something going on, much like the mistake of releasing
the guilty in a trial by jury. The rate of type II mistakes depends on several
factors. One factor is how much is going on, analogous to the severity of the
crime in a trial by jury. If there is a lot going on, one is less likely to make type
II errors. Another factor is the amount of variability (“noise”) in the
data, analogous to the quality of the evidence available in a trial by jury. A lot of
variability makes type II errors more likely. Yet another factor is the size of the
study, analogous to the amount of evidence in a trial by jury. There are more type II
errors in small studies than there are in large ones. Type II errors are rare in
really huge studies but quite common in small studies.
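
The two error rates and the three factors described above can be checked with a small simulation. The sketch below is only an illustration under stated assumptions, not a method taken from the text: it assumes normally distributed data with a known standard deviation, a two-sided z test of the hypothesis that the true mean is zero, and made-up values for the effect, the noise, and the study size. The function name rejection_rate and all of its parameters are hypothetical.

```python
import random
from statistics import NormalDist


def rejection_rate(effect, sigma, n, alpha=0.05, trials=4000, seed=0):
    """Estimate how often a two-sided z test of H0: mean = 0 reaches significance.

    Hypothetical helper for illustration: each trial draws n normal
    observations with true mean `effect` and known standard deviation `sigma`.
    """
    rng = random.Random(seed)                     # fixed seed for repeatable runs
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 when alpha = 0.05
    std_error = sigma / n ** 0.5
    rejections = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(effect, sigma) for _ in range(n)) / n
        if abs(sample_mean / std_error) > z_crit:
            rejections += 1
    return rejections / trials


# Nothing going on: the rejection rate is the type I error rate, close to alpha.
type_1 = rejection_rate(effect=0.0, sigma=1.0, n=20)

# Something going on: one minus the rejection rate is the type II error rate.
baseline = 1 - rejection_rate(effect=0.5, sigma=1.0, n=20)
more_going_on = 1 - rejection_rate(effect=1.0, sigma=1.0, n=20)
more_noise = 1 - rejection_rate(effect=0.5, sigma=2.0, n=20)
larger_study = 1 - rejection_rate(effect=0.5, sigma=1.0, n=80)

print(f"type I rate (no effect):        {type_1:.3f}")
print(f"type II rate, baseline:         {baseline:.3f}")
print(f"type II rate, more going on:    {more_going_on:.3f}")
print(f"type II rate, more variability: {more_noise:.3f}")
print(f"type II rate, larger study:     {larger_study:.3f}")
```

Under these assumptions the simulated type I rate should land near the chosen 5%, while the type II rate should fall when more is going on or the study is larger, and rise when the data are noisier.
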
There is a very important, subtle aspect of statistical tests, based on the
aforementioned three things that make type II errors very improbable. Since
really huge studies virtually guarantee getting statistical significance if there is
even the slightest amount going on, such studies can result in statistical significance
when the amount that is going on is of no practical importance. In this case,
statistical significance is attained in the face of no practical significance. On the
other hand, small studies can result in statistical nonsignificance when something
of great practical importance is going on. The conclusion is that the
attainment of statistical significance in a study is just as affected by extraneous
factors as it is by practical importance. It is essential to learn that statistical
significance is not synonymous with practical importance.
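
The gap between statistical and practical significance can be made concrete by computing the chance of reaching significance directly. This is a rough sketch under the same simplifying assumptions as a textbook z test (normal data, known standard deviation, two-sided test of a zero mean); the function name z_test_power and the two scenarios are hypothetical choices made for illustration.

```python
from statistics import NormalDist


def z_test_power(effect, sigma, n, alpha=0.05):
    """Probability that a two-sided z test of H0: mean = 0 reaches significance."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect * n ** 0.5 / sigma  # true mean of the z statistic
    # Chance the z statistic falls above +z_crit or below -z_crit.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)


# Huge study, trivial effect (1% of a standard deviation, assumed unimportant).
huge_trivial = z_test_power(effect=0.01, sigma=1.0, n=1_000_000)

# Small study, large effect (80% of a standard deviation, assumed important).
small_important = z_test_power(effect=0.8, sigma=1.0, n=5)

print(f"huge study, trivial effect:    {huge_trivial:.3f}")
print(f"small study, important effect: {small_important:.3f}")
```

In this sketch the huge study is almost certain to reach significance for an effect assumed to be of no practical importance, while the small study misses an effect assumed to be important more often than it detects it.
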


INTRODUCTION TO STATISTICAL TESTS OF SIGNIFICANCE 189