to interpret the results. Although the finer points of
statistics are presented elsewhere in this book, it is
common sense that the only way to interpret what
you measure is to define this whole process before
the experiment starts.
Thinking carefully about what might actually
constitute an observed response before you measure
it removes at least one important source of
bias. That bias is the clinical trialist himself/
herself. There has been too little emphasis in recent
years on the fundamentals of end points, their
variability and how they are measured. Further-
more, the relationship between what is measured
and its clinical relevance is always debatable: the
tendency is to measure something that can be
measured, rather than something that needs validation
as clinically relevant. Good examples include
rheumatological studies: counts of inflamed joints
before and after therapy may be reported, but do
not reveal whether the experimental treatment or
the corresponding placebo caused some of the
patients to recover the ability to write or others
the ability to walk (Chaput de Saintonge and Vere,
1982).
Most clinical trialists experience the urge, espe-
cially in early studies, to collect every piece of data
that they possibly can, before and after every drug
exposure. This urge comes from natural scientific
curiosity, as well as a proper ethical concern,
because the hazard associated with clinical trials
is never zero. It behooves us to maximize the
amount of information gained in return for the
risk that the patient takes for us, and for medicine
in general.
Consequently, large numbers of variables are
typically measured before and after drug (or
placebo) administration. These variables all exhibit
biological variation. Many of these variations have
familiar, unimodal, symmetrical distributions which
are supposed to resemble Gaussian (normal), chi-squared,
F, binomial and so on, probability density
functions. An intrinsic property of biological vari-
ables is that when measured one hundred times,
then, on average and if normally distributed, about
5% of those measurements will lie more than 2
standard deviations (strictly, 1.96) from the mean
(there are corollaries for the other probability density functions).
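That tail fraction can be checked directly against the normal distribution: strictly, 5% of a Gaussian lies beyond 1.96 standard deviations of the mean, and about 4.6% beyond 2. A minimal sketch (illustrative, not from the original text) using only the standard library:

```python
import math

def two_sided_tail(z):
    # Fraction of a normal distribution lying more than z standard
    # deviations from its mean, via the error function.
    return 2.0 * (1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0))))

print(round(two_sided_tail(1.96), 4))  # 0.05
print(round(two_sided_tail(2.0), 4))   # 0.0455
```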
This meets a typical, prospective ‘p < 0.05, and
therefore it is significant’ mantra. It is also true
that if you measure one hundred different variables,
on two occasions only, before and after administration
of the test material, then, on average, 5% of
those variables will appear significantly different
after treatment (this masquerades sometimes in
Table 9.1 Some example sources of bias in clinical trials

Poorly matched placebos
Subtle or obvious non-randomization of patients
Failure of double-blinding, for example when pharmacodynamic effects cannot be controlled
Prompting of prejudiced subjective responses
Non-uniform medical monitoring
Protocol amendments with unequal effects on treatment groups
Peculiarities of the study site itself (e.g. psychotropic drug effects in psychiatric institutions which fail to predict effects in out-patients)
Differing medical definitions across languages, dialects or countries (e.g. ‘mania’)
CRF with leading questions, either toward or away from adverse event reporting
Informal, ‘break the blind’ games played at study sites
Selective rigor in collection and storage of biological samples
Selectively incomplete data sets for each patient
Inappropriate use of parametric or non-parametric statistical techniques
Failure to adequately define end points prospectively, and retrospective ‘data dredging’
Acceptance of correlation as evidence of causation
Averaging of proportionate responses from non-homogeneous treatment groups, also known as Simpson’s paradox (see Spilker, 1991)
Unskeptical acceptance of anecdotal reports
Tendency to publish only positive results

CRF: case report form; the term ‘controlled’ is used in its technical sense (see Section 9.2 of this chapter).
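The multiple-comparisons point above — that measuring many unrelated variables before and after an inert treatment will, by chance alone, make about 5% of them cross the p < 0.05 threshold — can be sketched with a short simulation. This is a hypothetical illustration, not from the original text; the patient count, variable count and the 1.96 cut-off are assumptions.

```python
import math
import random

random.seed(7)  # fixed seed so the sketch is reproducible

def false_positives(n_vars=100, n_patients=30, z_crit=1.96):
    # Simulate n_vars unrelated variables measured before and after a
    # treatment with no real effect, and count how many 'before vs after'
    # differences look significant at roughly the 5% level.
    hits = 0
    for _ in range(n_vars):
        before = [random.gauss(0.0, 1.0) for _ in range(n_patients)]
        after = [random.gauss(0.0, 1.0) for _ in range(n_patients)]
        diff = [a - b for a, b in zip(after, before)]
        mean = sum(diff) / n_patients
        var = sum((d - mean) ** 2 for d in diff) / (n_patients - 1)
        z = mean / math.sqrt(var / n_patients)
        if abs(z) > z_crit:
            hits += 1
    return hits

print(false_positives())  # typically around 5 of the 100 variables
```

Each run reports a handful of spuriously ‘significant’ variables despite the treatment being inert, which is exactly why end points must be defined prospectively rather than dredged from the data afterwards.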
9.3 PROSPECTIVE DEFINITIONS: THE ONLY WAY TO INTERPRET WHAT YOU MEASURE 103