134 Metastatistics for the Non-Bayesian Regression Runner
where $\bar{y}_1 \equiv \frac{1}{N_1} \sum_{i:T=1} y_i$, $\bar{y}_0 \equiv \frac{1}{N_0} \sum_{i:T=0} y_i$, the subscript 1 refers to the
treatment group, the subscript 0 refers to the control group, and so on.
Taken literally, (3.14) suggests that, on average, in repeated samples, the mean
of any pre-treatment variable should be the same in the treatment and control groups. An auxiliary implication is that
if one ran the regression but also included pre-treatment variables, the estimate of
the effect of the treatment should not change. If it does change substantially, this
is evidence against the design, and a cause for concern.
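The auxiliary implication can be illustrated with a small simulation. What follows is only a sketch under stated assumptions: the data are simulated, the true effect (1.0), the single pre-treatment variable `W`, and the helper names `ols` and `treatment_estimates` are all hypothetical and do not come from the text. It fits the regression of the outcome on the treatment indicator with and without the pre-treatment variable and compares the two treatment coefficients.

```python
import numpy as np

def ols(X, y):
    """Return OLS coefficients from regressing y on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def treatment_estimates(y, T, W):
    """Treatment coefficient without, and then with, the pre-treatment variable W."""
    const = np.ones(len(y))
    short = ols(np.column_stack([const, T]), y)[1]      # y on constant and T
    long = ols(np.column_stack([const, T, W]), y)[1]    # y on constant, T, and W
    return short, long

# Simulated data (an assumption for illustration, not from the text).
rng = np.random.default_rng(0)
n = 5000
W = rng.normal(size=n)                      # hypothetical pre-treatment variable
T = rng.integers(0, 2, size=n)              # randomized assignment, independent of W
y = 1.0 * T + 2.0 * W + rng.normal(size=n)  # outcome with a true effect of 1.0

short, long = treatment_estimates(y, T, W)
print(f"without covariate: {short:.3f}   with covariate: {long:.3f}")
```

The coefficient from the regression without the covariate is numerically the difference between the two group means of the outcome. Under randomization the two coefficients should differ only by sampling noise; a large gap between them would be the warning sign described above.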
To fix ideas, suppose the treatment under consideration is ECMO (which we
considered in section 3.1.1) and suppose a standard randomization scheme was
employed on a large sample of children. A standard procedure is to report the
group averages of several pre-treatment variables. Table 3.1 is a hypothetical example.
Table 3.1 Hypothetical RCT on the efficacy of ECMO: pre-treatment values of key variables
(standard errors in parentheses)

Pre-treatment variable     Treatment    Control
Birth weight (grams)       3.26         3.21
                           (0.22)       (0.23)
Age (days)                 52           54
                           (13)         (14)
Usually researchers report whether there are any “significant” differences
between the treatment and control group means. The intended purpose is to ensure
that the two groups satisfy a ceteris paribus condition: in ways we can observe,
are the two groups roughly the same? This is sometimes referred to as “balance.”
If sample sizes are large enough, more often than not the values in
the two columns will not be “significantly” different. The comparison serves as a “check” that
the randomization achieved its intended purpose.
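A balance check of this kind can be sketched in a few lines. The function below is an illustration, not the procedure used in the text: the name `balance_row` is hypothetical, and the unequal-variance t-statistic is one common choice of test that the text does not specify. For a single pre-treatment variable it returns the group means, their standard errors, the difference, and the t-statistic for that difference.

```python
import numpy as np

def balance_row(x, T):
    """Compare the mean of pre-treatment variable x across T=1 and T=0 groups."""
    x = np.asarray(x, dtype=float)
    T = np.asarray(T)
    x1, x0 = x[T == 1], x[T == 0]
    m1, m0 = x1.mean(), x0.mean()
    # Standard error of each group mean (sample standard deviation / sqrt(n)).
    se1 = x1.std(ddof=1) / np.sqrt(len(x1))
    se0 = x0.std(ddof=1) / np.sqrt(len(x0))
    diff = m1 - m0
    # Unequal-variance t-statistic for the difference in means (an assumed choice).
    t_stat = diff / np.sqrt(se1**2 + se0**2)
    return {"mean_treat": m1, "se_treat": se1,
            "mean_control": m0, "se_control": se0,
            "diff": diff, "t": t_stat}

# Simulated example: a covariate drawn independently of a randomized assignment.
rng = np.random.default_rng(1)
T_sim = rng.integers(0, 2, size=2000)
x_sim = rng.normal(loc=50.0, scale=10.0, size=2000)  # hypothetical covariate
row = balance_row(x_sim, T_sim)
print({k: round(float(v), 3) for k, v in row.items()})
```

Running `balance_row` once per pre-treatment variable gives the rows of a table laid out like Table 3.1, with the t-statistic indicating whether a difference would conventionally be called “significant.”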
What variables should be included in this “check”? Presumably such a list does
not include hair color, although this, in principle, should be balanced as well.
The usual rule is to consider “pre-treatment variables which are predictive of the
outcome.” These may or may not be part of a proper “model” of infant death,
but are there to assure one that, if there is a large difference in the groups after
treatment, the researcher will not mistakenly attribute to the treatment what was
really a failure of the ceteris paribus condition.
Bayesians frequently point to a flaw in this argument:
My doubts were first crystallized in the summer of 1952 by Sir Ronald Fisher.
“What would you do,” I had asked, “if, drawing a Latin Square at random for
an experiment, you happened to draw a Knut Vik square?” Sir Ronald said he
thought we would draw again and that, ideally, a theory explicitly excluding
regular squares should be developed...