Statistical Methods for Psychology

population means and standard deviations). Others (Glass [1976] and Hedges [1981]) expressed their statistics (g′ and g, respectively) in terms of sample statistics, where Hedges used the pooled estimate of the population variance as the denominator (see Chapter 7 for the pooled estimate). The nice thing about any of these effect size measures is that they express the difference between means in terms of the size of a standard deviation. While it is nice to be correct, it is also nice, and sometimes clearer, to be consistent. As I have done elsewhere, I am going to continue to refer to our effect size measure as d, with apologies to Hedges and Glass. There is a direct relationship between the squared point-biserial correlation coefficient and d.

For our data on weights of males and females, we have

We can now conclude that the difference between the average weights of males and females is about 1 1/3 standard deviations. To me, that is more meaningful than saying that sex accounts for about 32% of the variation in weight.^2 An important point here is to see that these statistics are related in meaningful ways. We can go from $r_{pb}^2$ to d, and vice versa, depending on which seems to be a more meaningful statistic. With the increased emphasis on the reporting of effect sizes and similar measures, it is important to recognize these relationships.
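As a rough check on the arithmetic, the Python sketch below computes d two ways from the values quoted for the weight data: means of 151.25 and 131.4, a pooled standard deviation of 14.972, group sizes of 12 and 15, and a point-biserial correlation of -.565. The function names are mine and are only illustrative, and the reverse conversion is simply the same relationship solved for the squared correlation.

```python
from math import sqrt


def d_from_means(mean1, mean2, s_pooled):
    """Effect size d: the mean difference in pooled standard deviation units."""
    return (mean1 - mean2) / s_pooled


def d_from_r_pb(r_pb, n1, n2):
    """Magnitude of d implied by a point-biserial correlation.

    The square root discards the sign of r_pb, which depends only on
    how the dichotomous variable happened to be coded.
    """
    df = n1 + n2 - 2
    r2 = r_pb ** 2
    return sqrt(df * (n1 + n2) * r2 / (n1 * n2 * (1 - r2)))


def r2_pb_from_d(d, n1, n2):
    """Squared point-biserial correlation implied by d (the inverse mapping)."""
    df = n1 + n2 - 2
    return d ** 2 * n1 * n2 / (d ** 2 * n1 * n2 + df * (n1 + n2))


# Values quoted in the text for the male-female weight data.
d_means = d_from_means(151.25, 131.4, 14.972)
d_r = d_from_r_pb(-0.565, 12, 15)
r2_back = r2_pb_from_d(d_r, 12, 15)

print(round(d_means, 2), round(d_r, 2), round(r2_back, 3))
```

Both routes give a d of about 1.33, and converting back returns a squared correlation of about .32, which is the "32% of the variation" figure mentioned above.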

The Phi Coefficient (φ)

The point-biserial correlation coefficient deals with the situation in which one of the variables is a dichotomy. When both variables are dichotomies, we will want a different statistic. For example, we might be interested in the relationship between gender and employment, where individuals are scored as either male or female and as employed or unemployed. Similarly, we might be interested in the relationship between employment status (employed-unemployed) and whether an individual has been arrested for drunken driving. As a final example, we might wish to know the correlation between smoking (smokers versus nonsmokers) and death by cancer (versus death by other causes). Unless we are willing to make special assumptions concerning the underlying continuity of our variables, the most appropriate correlation coefficient is the φ (phi) coefficient. This is the same φ that we considered briefly in Chapter 6.
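To see what it means for both variables to be dichotomies, the Python sketch below computes φ from a 2 × 2 table of counts using the standard cell-count formula, and then confirms that an ordinary Pearson correlation on the same cases, coded 0 and 1, gives the identical value. The counts are hypothetical numbers made up for illustration (they are not the Gibson and Leitenberg data), and the function names are my own.

```python
from math import sqrt


def phi_from_table(a, b, c, d):
    """Phi for a 2 x 2 table of counts laid out as [[a, b], [c, d]]."""
    return (a * d - b * c) / sqrt((a + b) * (c + d) * (a + c) * (b + d))


def pearson_r(x, y):
    """Ordinary Pearson correlation computed from raw scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical counts: rows are the two levels of X, columns the two levels of Y.
a, b, c, d = 30, 10, 15, 25

# Expand the table into one (x, y) pair per case, coding each variable 0/1.
x = [0] * (a + b) + [1] * (c + d)
y = [0] * a + [1] * b + [0] * c + [1] * d

phi = phi_from_table(a, b, c, d)
r = pearson_r(x, y)

print(round(phi, 3), round(r, 3))
```

The two values agree (about .378 for these made-up counts), which is the sense in which φ is a Pearson correlation by another name.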

Calculating φ

Table 10.2 contains a small portion of the data from Gibson and Leitenberg (2000) (referred to in Exercise 6.33) on the relationship between sexual abuse training in school (which some of you may remember as “stranger danger” or “good touch-bad touch”) and

$$d = \frac{\bar{X}_1 - \bar{X}_2}{s_{pooled}} = \sqrt{\frac{df\,(n_1 + n_2)\,r_{pb}^2}{n_1 n_2\,(1 - r_{pb}^2)}}$$

$$d = \frac{151.25 - 131.4}{14.972} = 1.33 = \sqrt{\frac{25(12 + 15)(-.565)^2}{12 \times 15\,(1 - .565^2)}} = \sqrt{1.758} = 1.33$$

Section 10.1 Point-Biserial Correlation and Phi: Pearson Correlations by Another Name

(^2) If you then wish to calculate confidence limits on d, consult Kline (2004).
φ (phi) coefficient
