difference, and I might have been tempted to use it in a research methods course that
I taught, dividing the students in the course into two groups and repeating Aronson’s study.
Of course, I would not be very happy if I tried out a demonstration experiment on my students
and found that it fell flat. I want to be sure that I have sufficient power to have a
decent probability of obtaining a statistically significant result in lab.
What Aronson actually found, which is trivially different from the sample data I generated
in Chapter 7, were means of 9.58 and 6.55 for the Control and Threatened groups,
respectively. Their pooled standard deviation was approximately 3.10. We will assume that
Aronson’s estimates of the population means and standard deviation are essentially correct.
(They almost certainly suffer from some random error, but they are the best guesses
that we have of those parameters.) This produces
My class has a lot of students, but only about 30 of them are males, and they are not
evenly distributed across the lab sections. Because of the way that I have chosen to run the
experiment, assume that I can expect that 18 males will be in the Control group and 12 in
the Threat group. Then we will calculate the effective sample size (the sample size to be
used in calculating $\delta$) as
We see that the effective sample size is less than the arithmetic mean of the two individual
sample sizes. In other words, this study has the same power as it would have had we run it
with 14.4 subjects per group for a total of 28.8 subjects. Or, to state it differently, with
unequal sample sizes it takes 30 subjects to have the same power 28.8 subjects would have in
an experiment with equal sample sizes.
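The effective sample size used here is the harmonic mean of the two group sizes. A minimal Python sketch of that calculation follows; the function name `effective_sample_size` is ours, chosen for illustration, and is not from the text.

```python
def effective_sample_size(n1, n2):
    """Harmonic mean of two group sizes: 2 * n1 * n2 / (n1 + n2)."""
    return 2 * n1 * n2 / (n1 + n2)


n_h = effective_sample_size(18, 12)   # 18 Control, 12 Threat
arithmetic_mean = (18 + 12) / 2

# The harmonic mean never exceeds the arithmetic mean,
# and the two are equal only when n1 == n2.
print(f"effective n per group = {n_h:.2f}")
print(f"arithmetic mean       = {arithmetic_mean:.2f}")
```

Doubling the result gives the total number of subjects that an equal-n design would need for the same power.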
To continue,
For $\delta = 2.63$, power = .75 at $\alpha = .05$ (two-tailed).
In this case the power is a bit too low to inspire confidence that the study will work out
as a lab exercise is supposed to. I could take a chance and run the study, but the lab might
fail and then I’d have to stammer out some excuse in class and hope that people believed
that it “really should have worked.” I’m not comfortable with that.
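The chain from the means to the power figure of about .75 can be checked with a short Python sketch. It rests on one assumption that should be stated plainly: power is approximated from the standard normal distribution, which may not be exactly how the appendix table was computed. The function names are ours, and d is rounded to two decimals only to mirror the hand calculation.

```python
from math import sqrt
from statistics import NormalDist


def cohens_d(mean1, mean2, pooled_sd):
    """Standardized difference between two means."""
    return (mean1 - mean2) / pooled_sd


def noncentrality(d, n_h):
    """delta = d * sqrt(n_h / 2) for two independent groups."""
    return d * sqrt(n_h / 2)


def approx_power(delta, alpha=0.05):
    """Two-tailed power under a normal approximation (an assumption)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Probability of landing in either rejection region
    return z.cdf(delta - z_crit) + z.cdf(-delta - z_crit)


d = round(cohens_d(9.58, 6.55, 3.10), 2)   # rounded as in the hand calculation
delta = noncentrality(d, 14.4)             # effective n for groups of 18 and 12
power = approx_power(delta)
print(f"d = {d:.2f}, delta = {delta:.2f}, power = {power:.2f}")
```

An exact calculation would use the noncentral t distribution rather than the normal, so small discrepancies from tabled values are to be expected.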
An alternative would be to recruit some more students. I will use the 30 males in my
course, but I can also find another 20 in another course who are willing to participate.
At the risk of teaching bad experimental design to my students by combining two different
classes (at least it gives me an excuse to mention that this could be a problem), I will add in
those students and expect to get sample sizes of 28 and 22.
These sample sizes would yield $n_h = 24.64$. Then
From Appendix Power we find that power now equals approximately .93, which is certainly
sufficient for our purposes.
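To see the two designs side by side, the following Python sketch recomputes the effective sample size, $\delta$, and approximate power for group sizes of 18 and 12 and of 28 and 22. It assumes d = 0.98 as estimated from Aronson's data and again uses a normal approximation to power, which is our assumption and not necessarily the method behind the appendix table. The helper `design_power` is a hypothetical name.

```python
from math import sqrt
from statistics import NormalDist


def design_power(n1, n2, d=0.98, alpha=0.05):
    """Return (n_h, delta, approximate two-tailed power) for two group sizes."""
    n_h = 2 * n1 * n2 / (n1 + n2)          # harmonic mean of the group sizes
    delta = d * sqrt(n_h / 2)
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    power = z.cdf(delta - z_crit) + z.cdf(-delta - z_crit)
    return n_h, delta, power


# Original class only, then the class plus 20 recruited students
for n1, n2 in [(18, 12), (28, 22)]:
    n_h, delta, power = design_power(n1, n2)
    print(f"n = ({n1}, {n2}): n_h = {n_h:.2f}, "
          f"delta = {delta:.2f}, power = {power:.2f}")
```

Under these assumptions the printed values should agree, to two decimals, with the hand calculations in the text.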
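$$d = \frac{\mu_1 - \mu_2}{\sigma} = \frac{9.58 - 6.55}{3.10} = \frac{3.03}{3.10} = 0.98$$

$$n_h = \frac{2(18)(12)}{18 + 12} = \frac{432}{30} = 14.40$$

$$\delta = d\sqrt{\frac{n_h}{2}} = 0.98\sqrt{\frac{14.4}{2}} = 0.98\sqrt{7.2} = 2.63$$

$$n_h = 24.64$$

$$\delta = d\sqrt{\frac{n_h}{2}} = 0.98\sqrt{\frac{24.64}{2}} = 0.98\sqrt{12.32} = 3.44$$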
Section 8.4 Power Calculations for Differences Between Two Independent Means 235