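[Figure 9.9: scatterplot of Weight (vertical axis, 100 to 200) against Height (horizontal axis, 60 to 75), with males and females plotted using different symbols]
Figure 9.9 Relationship between height and weight for males and females combined
(dashed line = female, solid line = male, dotted line = combined)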
sample data from the Minitab manual (Ryan et al., 1985). These are actual data from 92 col-
lege students who were asked to report height, weight, gender, and several other variables.
(Keep in mind that these are self-report data, and there may be systematic reporting biases.)
When we combine the data from both males and females, the relationship is strikingly
good, with a correlation of .78. When you look at the data from the two genders separately,
however, the correlations fall to .60 for males and .49 for females. (Males and females have
been plotted using different symbols, with data from females primarily in the lower left.)
The important point is that the high correlation we found when we combined genders is
not due purely to the relation between height and weight. It is also due largely to the fact
that men are, on average, taller and heavier than women. In fact, a little doodling on a sheet
of paper will show that you could create artificial, and improbable, data in which weight is
negatively related to height within each gender, while the relationship is positive when
you collapse across gender. (The regression equation for males is
$\hat{Y}_{\text{male}} = 4.36\,\text{Height}_{\text{male}} - 149.93$ and for females is
$\hat{Y}_{\text{female}} = 2.58\,\text{Height}_{\text{female}} - 44.86$.) The point I
am making here is that experimenters must be careful when they combine data from sev-
eral sources. The relationship between two variables may be obscured or enhanced by the
presence of a third variable. Such a finding is important in its own right.
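
The "doodling" described above can also be done in a few lines of code. What follows is only a minimal sketch: the heights and weights are invented for illustration (they are not the Minitab data), and `pearson_r` is a small helper written for this example rather than a library routine. Within each gender the made-up points fall on a descending line, yet the pooled correlation comes out clearly positive, because the male points sit above and to the right of the female points.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, invented for illustration (NOT the Minitab sample).
# Within each gender, weight goes DOWN as height goes up.
female_height = [60, 62, 64, 66]
female_weight = [130, 125, 120, 115]
male_height = [68, 70, 72, 74]
male_weight = [180, 175, 170, 165]

r_female = pearson_r(female_height, female_weight)
r_male = pearson_r(male_height, male_weight)

# Collapse across gender by pooling the two subsamples.
r_combined = pearson_r(female_height + male_height,
                       female_weight + male_weight)

print(f"females: {r_female:.2f}  males: {r_male:.2f}  combined: {r_combined:.2f}")
```

With these particular numbers the within-gender correlations are negative and the combined correlation is roughly .75. The sign of the pooled correlation is driven almost entirely by the difference between the group means, not by anything happening within either group.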
A second example of heterogeneous subsamples that makes a similar point is the rela-
tionship between cholesterol consumption and cardiovascular disease in men and women.
If you collapse across both genders, the relationship is not impressive. But when you sepa-
rate the data by male and female, there is a distinct trend for cardiovascular disease to
increase with increased consumption of cholesterol. This relationship is obscured in the
combined data because men, regardless of cholesterol level, have an elevated level of
cardiovascular disease compared to women.
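
The opposite situation, in which pooling obscures a relationship, can be sketched in the same way. The numbers below are again invented purely for illustration: `cholesterol` and the two `disease` lists are hypothetical scores in arbitrary units, not results from any study, and `pearson_r` is a helper defined for this example. Each gender shows a perfect positive trend, but the men's disease scores are shifted upward at every cholesterol level, and that shift adds variability that swamps the trend in the combined data.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores in arbitrary units (assumed for illustration only).
cholesterol = [1, 2, 3, 4, 5]            # same intake levels for both groups
disease_women = [1, 2, 3, 4, 5]          # rises with cholesterol
disease_men = [11, 12, 13, 14, 15]       # same trend, elevated at every level

r_women = pearson_r(cholesterol, disease_women)
r_men = pearson_r(cholesterol, disease_men)

# Collapse across gender by pooling the two subsamples.
r_pooled = pearson_r(cholesterol + cholesterol, disease_women + disease_men)

print(f"women: {r_women:.2f}  men: {r_men:.2f}  pooled: {r_pooled:.2f}")
```

Here the pooled correlation is far smaller than either within-gender correlation, even though the trend is identical in the two groups.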
9.15 Power Calculation for Pearson’s r
Consider the problem of the individual who wishes to demonstrate a relationship between
television violence and aggressive behavior. Assume that he has surmounted all the very
real problems associated with designing this study and has devised a way to obtain a correlation
between the two variables. He believes that the correlation coefficient in the population
(ρ) is approximately .30. (This correlation may seem small, but it is impressive when