Statistical Methods for Psychology

For our data on homophobia we have

This result expresses the difference between the two groups in standard deviation units, and tells us that the mean arousal for homophobic participants was nearly 2/3 of a standard deviation higher than the arousal of nonhomophobic participants. That strikes me as a big difference. (Using the software by Cumming and Finch (2001) we find that the confidence limits on d are 0.1155 and 1.125, an interval that is also rather wide. At the same time, even the lower limit on the confidence interval is meaningfully large.)

Some words of caution. In the example of homophobia, the units of measurement were largely arbitrary, and a 7.5 difference had no intrinsic meaning to us. Thus it made more sense to express it in terms of standard deviations because we have at least some understanding of what that means. However, there are many cases wherein the original units are meaningful, and in that case it may not make much sense to standardize the measure (i.e., report it in standard deviation units). We might prefer to specify the difference between means, or the ratio of means, or some similar statistic. The earlier example of the moon illusion is a case in point. There it is far more meaningful to speak of the horizon moon appearing approximately half-again as large as the zenith moon, and I see no advantage, and some obfuscation, in converting to standardized units. The important goal is to give the reader an appreciation of the size of a difference, and you should choose the measure that best expresses this difference. In one case a standardized measure such as d is best, and in other cases other measures, such as the distance between the means, are better.

The second word of caution applies to effect sizes taken from the literature. It has been known for some time (Sterling, 1959; Lane and Dunlap, 1978; Brand, Bradley, Best, and Stoica, 2008) that if we base our estimates of effect size solely on the published literature, we are likely to overestimate effect sizes.
This occurs because there is a definite tendency to publish only statistically significant results, and thus those studies that did not have a significant effect are underrepresented in averaging effect sizes. For example, Lane and Dunlap (1978) ran a simple sampling study with the true effect size set at .25 and a difference between means of 4 points (standard deviation = 16). With sample sizes set at n1 = n2 = 15, they found an average difference between means of 13.21 when looking only at results that were statistically significant at α = .05. In addition they found that the sample standard deviations were noticeably underestimated, which would result in a bias toward narrower confidence limits. We need to keep these findings in mind when looking at only published research studies.

Finally, I should note that the increase in interest in using trimmed means and Winsorized variances in testing hypotheses carries over to the issue of effect sizes. Algina, Keselman, and Penfield (2005) have recently pointed out that measures such as Cohen’s d are often improved by use of these statistics. The same holds for confidence limits on the differences.

As you will see in the next chapter, Cohen laid out some very general guidelines for what he considered small, medium, and large effect sizes. He characterized d = .20 as an effect that is small, but probably meaningful, an effect size of d = .50 as a medium effect that most people would be able to notice (such as a half of a standard deviation difference in IQ), and an effect size of d = .80 as large. We should not make too much of Cohen’s levels, but they are helpful as a rough guide.
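The kind of sampling study attributed above to Lane and Dunlap (1978) can be imitated with a small simulation. The sketch below is only an illustration under stated assumptions, not a reconstruction of their procedure: it assumes normally distributed scores, a pooled-variance t test, and a two-tailed criterion at α = .05 (critical t = 2.048 for 28 df). The function name `simulate`, the number of replications, and the seed are hypothetical choices made here.

```python
import math
import random
import statistics

TRUE_DIFF = 4.0   # true difference between population means (from the text)
SIGMA = 16.0      # population standard deviation (from the text)
N = 15            # observations per group (from the text)
T_CRIT = 2.048    # two-tailed .05 critical t for df = 28 (assumed criterion)


def simulate(reps=10000, seed=1):
    """Average the mean difference and pooled SD over significant results only."""
    rng = random.Random(seed)
    sig_diffs, sig_sds = [], []
    for _ in range(reps):
        g1 = [rng.gauss(TRUE_DIFF, SIGMA) for _ in range(N)]
        g2 = [rng.gauss(0.0, SIGMA) for _ in range(N)]
        diff = statistics.fmean(g1) - statistics.fmean(g2)
        # With equal n, the pooled variance is the mean of the two variances.
        pooled_var = (statistics.variance(g1) + statistics.variance(g2)) / 2
        se = math.sqrt(pooled_var * 2 / N)
        if abs(diff / se) > T_CRIT:  # keep only the "publishable" results
            sig_diffs.append(diff)
            sig_sds.append(math.sqrt(pooled_var))
    rate = len(sig_diffs) / reps
    return statistics.fmean(sig_diffs), statistics.fmean(sig_sds), rate


if __name__ == "__main__":
    mean_diff, mean_sd, rate = simulate()
    print(f"proportion significant: {rate:.3f}")
    print(f"mean difference among significant results: {mean_diff:.2f}")
    print(f"mean pooled SD among significant results: {mean_sd:.2f}")
```

Under these assumptions the significant results should show an average difference well above the true value of 4 and an average pooled standard deviation below 16, which is the pattern the text describes. The exact figures depend on the seed and on the assumptions listed.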

Reporting results

Reporting results for a t test on two independent samples is basically similar to reporting results for the case of dependent samples. In Adams et al.’s study of homophobia, two groups of participants were involved: one group scoring high on a scale of homophobia, and the

$$\hat{d} = \frac{\bar{X}_1 - \bar{X}_2}{s_p} = \frac{24.00 - 16.50}{12.02} = 0.62$$
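The arithmetic behind the effect size for the homophobia data can be checked with a short sketch. The means (24.00 and 16.50) and the pooled standard deviation (12.02) come from the text. The function names `cohens_d` and `cohen_label` are hypothetical, and treating Cohen's .20, .50, and .80 as bin edges is an assumption made here for illustration, since Cohen offered them only as rough guides.

```python
def cohens_d(mean1, mean2, pooled_sd):
    """Standardized difference between two means, in pooled-SD units."""
    return (mean1 - mean2) / pooled_sd


def cohen_label(d):
    """Rough verbal label; using the guideline values as cutoffs is an assumption."""
    size = abs(d)
    if size < 0.20:
        return "below small"
    if size < 0.50:
        return "small"
    if size < 0.80:
        return "medium"
    return "large"


if __name__ == "__main__":
    d = cohens_d(24.00, 16.50, 12.02)
    print(f"d = {d:.2f} ({cohen_label(d)})")
```

When the original units are meaningful, as in the moon illusion example, the raw difference or ratio of means may be the better thing to report, and a label like this adds little.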

210 Chapter 7 Hypothesis Tests Applied to Means

