As Carlson and Timm (1974) argued, a more appropriate way to compare the models is
to examine the hypotheses they test. These authors point out that Method III represents an
estimation of treatment effects when cell means are weighted equally, and is particularly
appropriate whenever we consider sample size to be independent of treatment conditions.
A convincing demonstration of this is presented in Overall, Spiegel, and Cohen (1975).
Carlson and Timm also showed that Method II produces estimates of treatment effects
when row and column means are weighted by the sample size, but only when no interac-
tion is present. When an interaction is present, simple estimates of row and column effects
cannot be made, and, in fact, the null hypotheses actually tested are very bizarre indeed
[see Carlson and Timm (1974) for a statement of the null hypotheses for Method II]. SPSS,
which once relied on a method similar to Method II, finally saw the light some years ago
and came around to using Method III as the default. They labeled this method “Unique SS”
because each effect is assigned only that portion of the variation that it uniquely explains.
An excellent discussion of the hypotheses tested by different approaches is presented in
Blair and Higgins (1978) and Blair (1978). As Cochran and Cox suggested, “the only com-
plete solution of the ‘missing data’ problem is not to have them” (p. 82).
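The "unique SS" idea behind Method III can be sketched numerically. The following minimal Python example (the unbalanced cell sizes and effect sizes are invented for illustration, not taken from the text) assigns each effect only the regression sum of squares it adds over and above all other effects, using effect-coded predictors:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical unbalanced 2 x 2 layout; cell sizes and effects are invented.
sizes = {(0, 0): 8, (0, 1): 3, (1, 0): 4, (1, 1): 9}
A, B, y = [], [], []
for (ai, bi), n in sizes.items():
    A += [ai] * n
    B += [bi] * n
    y += list(1.0 * ai + 0.5 * bi + rng.normal(0.0, 1.0, n))
A, B, y = np.array(A), np.array(B), np.array(y)

# Effect-coded predictors (-1/+1); with this coding the "unique SS"
# approach tests hypotheses about unweighted cell means.
a, b = 2 * A - 1, 2 * B - 1
ab = a * b

def ss_reg(*cols):
    """Regression (model) sum of squares of y on an intercept plus cols."""
    X = np.column_stack([np.ones_like(y), *cols])
    yhat = X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return float(((yhat - y.mean()) ** 2).sum())

full = ss_reg(a, b, ab)
ss_A = full - ss_reg(b, ab)   # what A uniquely explains beyond B and AB
ss_B = full - ss_reg(a, ab)   # what B uniquely explains beyond A and AB
ss_AB = full - ss_reg(a, b)   # what AB uniquely explains beyond A and B
print(ss_A, ss_B, ss_AB)
```

Note that with unequal cell sizes these unique components need not add up to the full-model sum of squares, since the effects are not orthogonal.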
There is a third method of computing sums of squares that at first seems particularly
bizarre. Just to make matters even more confusing than they need to be, this is the method
that SPSS and SAS refer to as “Type I SS,” or Method I, but which I will refer to as
hierarchical sums of squares, though it is sometimes referred to as sequential sums of
squares, which is the term that SPSS uses. The peculiar thing about this approach is that it
is dependent on the order in which you name your variables. Thus if you tell SAS or SPSS
to model (predict or account for) the dependent variable on the basis of A, B, and AB, the
program will first assign SS_A = SS_regression(a). Then SS_B = SS_regression(a,b) − SS_regression(a), and
finally SS_AB = SS_regression(a,b,ab) − SS_regression(a,b). In this situation the first effect is assigned
all of the sums of squares it can possibly account for. The next effect is assigned all that it
can account for over and above what was accounted for by the first one. Finally, the inter-
action effect is assigned only what it accounts for over and above the two main effects. But,
if you ask the software to model the dependent variable on the basis of B, A, and AB, then
SS_B will equal SS_regression(b), which is quite a different thing from SS_regression(a,b) − SS_regression(a).
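This order dependence is easy to demonstrate. The following minimal Python sketch (the unbalanced data are invented for illustration) computes the sequential sum of squares for B under the two orders of entry; with unequal cell sizes the coded predictors for A and B are correlated, so the two answers differ:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical unbalanced 2 x 2 data; cell sizes and effects are invented.
sizes = {(0, 0): 8, (0, 1): 3, (1, 0): 4, (1, 1): 9}
A, B, y = [], [], []
for (ai, bi), n in sizes.items():
    A += [ai] * n
    B += [bi] * n
    y += list(2.0 * ai + 1.5 * bi + rng.normal(0.0, 1.0, n))
A, B, y = np.array(A), np.array(B), np.array(y)

a, b = 2 * A - 1, 2 * B - 1  # effect coding (-1/+1)

def ss_reg(*cols):
    """Regression (model) sum of squares of y on an intercept plus cols."""
    X = np.column_stack([np.ones_like(y), *cols])
    yhat = X @ np.linalg.lstsq(X, y, rcond=None)[0]
    return float(((yhat - y.mean()) ** 2).sum())

# Sequential (Type I) SS for B under the two orders of entry:
ssb_second = ss_reg(a, b) - ss_reg(a)  # B entered after A
ssb_first = ss_reg(b)                  # B entered first
print(ssb_second, ssb_first)           # unequal when cell sizes are unequal
```

With balanced data the two quantities would coincide, because the effect-coded predictors would then be orthogonal.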
The only time I could recommend using this approach is if you have a strong reason to
want to control the variables in a particular order.^3 If you can defend the argument that
Variable A is so important that it should be looked at first without controlling for any other
variables, then perhaps this is a method you can use. But I have never seen a case where
I would want to do that, with the possible exception of dealing with a variable as a covari-
ate, which we will discuss shortly. The only reason that I bring the issue up here at all is to
explain some of the choices you will have to make in using computer software. (For a more
complete discussion of this issue go to my Web pages at http://www.uvm.edu/~dhowell/StatPages/ and click
on New Material.)
Howell and McConaughy (1982) argued that there are very few instances in which one
would want to test the peculiar null hypotheses tested by Method II. The debate over the
“correct” model will probably continue for some time, mainly because no one model is
universally “correct,” and because there are philosophical differences in the approaches to
model specification [see Howell & McConaughy (1982) and Lewis & Keren (1977) versus
Appelbaum & Cramer (1974), O’Brien (1976), and Macnaughton (1992)]. However, the
conclusion to be drawn from the literature at present is that for the most common situations
Method III is appropriate, since we usually want to test unweighted means. (This is the
Section 16.4 Analysis of Variance with Unequal Sample Sizes 595
^3 There is a good and honorable tradition of prioritizing variables in this way for theoretical studies using standard
multiple regression with continuous variables. I have never seen a similar application in an analysis of variance
framework, though I have seen a number of people write about hypothetical examples.