It may be, even in a controlled experiment, that the measured response is a function of variables
present in addition to the treatment variable. A confounding variable is one that has an effect on the
outcomes of the study but whose effects cannot be separated from those of the treatment variable. A
lurking variable is one that has an effect on the outcomes of the study but whose influence was not part of
the investigation. A lurking variable can be a confounding variable.
example: A study is conducted to see if Yummy Kibble dog food results in shinier coats on golden
retrievers. It’s possible that the dogs with shinier coats have them because they have owners
who are more conscientious in terms of grooming their pets. Both the dog food and the
conscientious owners could contribute to the shinier coats. The variables are confounded
because their effects cannot be separated.
A well-designed study attempts to anticipate confounding variables in advance and control for them.
Statistical control refers to a researcher holding constant variables not under study that might have an
influence on the outcomes.
example: You are going to study the effectiveness of SAT preparation courses on SAT score. You
know that better students tend to do well on SAT tests. You could control for the possible
confounding effect of academic quality by running your study with groups of “A” students, “B”
students, etc.
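One way to picture this kind of control is to compare course-takers with non-takers only within the same grade group. The following is a minimal sketch, not a prescribed method: the function name `effect_within_grade`, the record layout, and the scores themselves are all invented for illustration.

```python
from collections import defaultdict
from statistics import mean

def effect_within_grade(records):
    """Return {grade: mean score with course - mean score without course}.

    records is an iterable of (grade, took_course, score) tuples.
    Comparing only within a grade group holds academic quality
    roughly constant, so it cannot masquerade as a course effect.
    """
    groups = defaultdict(lambda: {True: [], False: []})
    for grade, took_course, score in records:
        groups[grade][took_course].append(score)
    return {
        grade: mean(by_arm[True]) - mean(by_arm[False])
        for grade, by_arm in groups.items()
        if by_arm[True] and by_arm[False]  # need both arms to compare
    }

# Made-up scores, purely illustrative (not real data).
records = [
    ("A", True, 1400), ("A", True, 1360), ("A", False, 1350), ("A", False, 1330),
    ("B", True, 1200), ("B", True, 1180), ("B", False, 1150), ("B", False, 1170),
]
diffs = effect_within_grade(records)
print(diffs)
```

Each entry compares like with like: “A” students who took the course against “A” students who did not, and so on for each grade group.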
Control is often considered to be one of the three basic principles of experimental design. The other
two basic principles are randomization and replication.
The purpose of randomization is to balance the effects of lurking variables across the groups being
compared. Randomization involves the use of chance (like a coin flip) to assign subjects to
treatment and control groups. The hope is that the groups being studied will differ systematically only in
the effects of the treatment variable. Although individuals within the groups may vary, the idea is to make
the groups as alike as possible except for the treatment variable. Note that it isn’t possible to produce,
with certainty, groups free of any lurking variables. It is possible, through the use of randomization, to
increase the probability of producing groups that are alike. The idea is to control for the effects of
variables you aren’t aware of but that might affect the response.
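Random assignment can be sketched in a few lines of code. This version shuffles the subject list and splits it in half, which is one common variant of the coin-flip idea that also keeps the two groups the same size. The function name `randomize`, the subject labels, and the seed are assumptions made for illustration.

```python
import random

def randomize(subjects, seed=None):
    """Split subjects by chance into (treatment, control) groups.

    Shuffling and cutting the list in half is an assumed variant of
    the coin flip: every subject is equally likely to land in either
    group, and the group sizes differ by at most one.
    """
    rng = random.Random(seed)      # a seed makes the split reproducible
    shuffled = list(subjects)      # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical subject labels.
subjects = [f"subject_{i}" for i in range(1, 21)]
treatment, control = randomize(subjects, seed=1)
print(len(treatment), len(control))
```

Because chance alone decides who goes where, no characteristic of the subjects, known or unknown, can systematically steer them into one group.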
Replication involves repeating the experiment on enough subjects (or units) to reduce the effects of
chance variation on the outcomes. For example, we know that the numbers of boys and girls born in a year
are approximately equal. A small hospital with only 10 births a year is much more likely to vary
dramatically from 50% each than a large hospital with 500 births a year.
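The hospital comparison can be checked with an exact binomial calculation. The sketch below assumes each birth is independently a boy with probability exactly 0.5 (the text says only "approximately equal") and takes "vary dramatically" to mean missing 50% by more than 10 percentage points. Both choices, and the function name `prob_far_from_half`, are assumptions made for illustration.

```python
from math import comb

def prob_far_from_half(n, tolerance_pct=10):
    """P(percent of boys among n births misses 50% by more than tolerance_pct).

    Assumes independent births, each a boy with probability 0.5.
    Integer arithmetic avoids floating-point trouble at the boundary.
    """
    favorable = sum(
        comb(n, k)
        for k in range(n + 1)
        if abs(100 * k - 50 * n) > tolerance_pct * n
    )
    return favorable / 2 ** n

small = prob_far_from_half(10)    # hospital with 10 births a year
large = prob_far_from_half(500)   # hospital with 500 births a year
print(small, large)
```

Under these assumptions the small hospital misses 50% by more than 10 points in roughly a third of years (352/1024 = 0.34375), while for the large hospital such a year is extremely rare. That is the sense in which more replication reduces the effects of chance variation.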
Completely Randomized Design
A completely randomized design for a study involves three essential elements: random allocation of
subjects to treatment and control groups; administration of different treatments to each randomized group
(in this sense we are calling a control group a “treatment”); and some sort of comparison of the outcomes
from the various groups. A standard diagram of this situation is the following:
There may be several different treatment groups (different levels of a new drug, for example), in