Statistical Methods for Psychology


You can calculate an intraclass correlation coefficient in a number of different ways,
depending on whether you treat judges as a fixed or random variable and whether judges
evaluate the same or different subjects. The classic reference for intraclass correlation is
Shrout and Fleiss (1979), who discuss several alternative approaches. I am going to discuss
only the most common approach here, one in which we consider our judges to be a random
sample of all judges we could have used and in which each judge rates the same set of subjects
once. (In what follows I am assuming that judges are rating “subjects,” but they could
be rating pictures, cars, or the livability of cities. Take the word “subject” as a generic term
for whatever is being rated.)
We will start by assuming that the data in Table 14.14 can be represented by the following
model:

$$X_{ij} = \mu + \alpha_i + \pi_j + \alpha\pi_{ij} + e_{ij}$$

In this model $\alpha_i$ stands for the effect of the ith judge, $\pi_j$ stands for the effect of the jth
subject (person), $\alpha\pi_{ij}$ is the interaction between the ith judge and the jth subject (the degree to
which the judge changes his or her rating system when confronted with that subject), and
$e_{ij}$ stands for the error associated with that specific rating. Because each judge rates each
subject only once, it is not possible in this model to estimate $\alpha\pi_{ij}$ and $e_{ij}$ separately, but it
is necessary to keep them separate in the model.
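To make the model concrete, here is a minimal simulation sketch. It assumes normally distributed effects, and the function name, the grand mean, and the standard deviations are hypothetical values chosen only for illustration. Because each judge rates each subject once, the interaction and error terms are drawn together as a single residual.

```python
import random


def simulate_ratings(n_judges, n_subjects, mu=5.0, sd_judge=1.0,
                     sd_subject=2.0, sd_resid=0.5, seed=0):
    """Return ratings[i][j] for judge i and subject j under the additive model.

    All parameter values are illustrative assumptions, not estimates
    taken from Table 14.14.
    """
    rng = random.Random(seed)
    alpha = [rng.gauss(0.0, sd_judge) for _ in range(n_judges)]    # judge effects
    pi = [rng.gauss(0.0, sd_subject) for _ in range(n_subjects)]   # subject effects
    # Interaction and error cannot be separated with one rating per cell,
    # so a single residual stands in for their sum.
    return [[mu + alpha[i] + pi[j] + rng.gauss(0.0, sd_resid)
             for j in range(n_subjects)]
            for i in range(n_judges)]


ratings = simulate_ratings(n_judges=3, n_subjects=5)
for row in ratings:
    print(" ".join(f"{x:6.2f}" for x in row))
```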
If you look back to the previous chapter you will see that when we calculated a
magnitude-of-effect measure (which was essentially an $r^2$-family measure), we took the variance
estimate for the effect in question (in this case differences among subjects) relative to the sum
of the estimates of the several sources of variance. That is precisely what we are going to
do here. We will let

$$\text{Intraclass correlation} = \frac{\sigma^2_\pi}{\sigma^2_\alpha + \sigma^2_\pi + \sigma^2_{\alpha\pi} + \sigma^2_e}$$

If most of the variability in the data is due to differences between subjects, with only a
small amount due to differences between judges, the interaction of judges and subjects, and
error, then this ratio will be close to 1.00. If judges differ from one another in how high or
low they rate people in general, or if there is a judge by subject interaction (different judges
rate different people differently), or if there is a lot of error in the ratings, the denominator
will be substantially larger than the numerator and the ratio will be much less than 1.00.
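The behavior of this ratio can be sketched in a few lines of Python. The function name and the variance components below are hypothetical, picked only to show the two situations just described: the ratio is the subject variance divided by the sum of the subject, judge, interaction, and error variances.

```python
def icc_from_components(var_subject, var_judge, var_interaction, var_error):
    """Share of the total variance that is due to differences among subjects."""
    total = var_judge + var_subject + var_interaction + var_error
    if total <= 0:
        raise ValueError("the variance components must sum to a positive value")
    return var_subject / total


# Hypothetical components: subjects dominate, so the ratio is close to 1.00.
print(icc_from_components(var_subject=9.0, var_judge=0.5,
                          var_interaction=0.25, var_error=0.25))  # 0.9

# Same subject variance, but judges, interaction, and error are all larger.
print(icc_from_components(var_subject=9.0, var_judge=6.0,
                          var_interaction=2.0, var_error=3.0))    # 0.45
```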
To compute the intraclass correlation we are first going to run a Subjects × Judges
analysis of variance with Judges as a repeated measure. Because each judge rates each
subject only once, there will not be an independent estimate of error, and we will have to use
the Judge × Subject interaction as the error term. From the summary table that results, we
will compute our estimate of the intraclass correlation as

$$\text{Intraclass correlation} = \frac{MS_{\text{Subjects}} - MS_{J \times S}}{MS_{\text{Subjects}} + (j - 1)MS_{J \times S} + j(MS_{\text{Judge}} - MS_{J \times S})/n}$$

where j represents the number of judges and n represents the number of subjects.
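The whole computation, from raw ratings to the coefficient, can be sketched as follows. This is a plain-Python illustration rather than the output of any statistics package; the function name and the small ratings table are hypothetical. The table is laid out with subjects as rows and judges as columns, and the Judge × Subject sum of squares is obtained by subtraction because it doubles as the error term.

```python
def icc_from_ratings(ratings):
    """Intraclass correlation from a subjects-by-judges table of ratings.

    ratings[s][k] is the single rating that judge k gave to subject s.
    """
    n = len(ratings)        # number of subjects
    j = len(ratings[0])     # number of judges
    grand = sum(sum(row) for row in ratings) / (n * j)

    subject_means = [sum(row) / j for row in ratings]
    judge_means = [sum(row[k] for row in ratings) / n for k in range(j)]

    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_subjects = j * sum((m - grand) ** 2 for m in subject_means)
    ss_judge = n * sum((m - grand) ** 2 for m in judge_means)
    ss_jxs = ss_total - ss_subjects - ss_judge  # interaction doubles as error

    ms_subjects = ss_subjects / (n - 1)
    ms_judge = ss_judge / (j - 1)
    ms_jxs = ss_jxs / ((n - 1) * (j - 1))

    return (ms_subjects - ms_jxs) / (
        ms_subjects + (j - 1) * ms_jxs + j * (ms_judge - ms_jxs) / n
    )


# Hypothetical ratings: 4 subjects (rows) by 3 judges (columns).
# Each judge rates every subject exactly 1 point higher than the previous judge.
example = [
    [1, 2, 3],
    [3, 4, 5],
    [5, 6, 7],
    [7, 8, 9],
]
print(round(icc_from_ratings(example), 3))
```

In this made-up table the judges order the subjects identically, so the interaction mean square is zero. The coefficient is nevertheless 20/23, or about .87, because the judges differ in how high they rate overall and that difference enters the denominator.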
To illustrate this, I have run the analysis of variance on the data in Table 14.14b, which
is the data set where I have deliberately built in some differences due to subjects and
judges. The summary table for this analysis follows.


Source                 df      SS        MS        F
Between subjects        4    57.067    14.267
Within subjects        10    20.666     2.067
   Judge                2    20.133    10.067    150.25
   Judge × Subjects     8     0.533     0.067
Total                  14    77.733
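
As a worked check, the mean squares in this table can be substituted into the intraclass correlation formula. The values j = 3 and n = 5 are inferred from the degrees of freedom for Judge (2) and Between subjects (4); the arithmetic below is a sketch of that substitution.

```latex
\begin{align*}
\text{Intraclass correlation}
  &= \frac{MS_{\text{Subjects}} - MS_{J \times S}}
          {MS_{\text{Subjects}} + (j - 1)MS_{J \times S} + j(MS_{\text{Judge}} - MS_{J \times S})/n} \\
  &= \frac{14.267 - 0.067}{14.267 + (3 - 1)(0.067) + 3(10.067 - 0.067)/5} \\
  &= \frac{14.200}{14.267 + 0.134 + 6.000}
   = \frac{14.200}{20.401} \approx .70
\end{align*}
```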

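$$X_{ij} = \mu + \alpha_i + \pi_j + \alpha\pi_{ij} + e_{ij}$$

$$\text{Intraclass correlation} = \frac{\sigma^2_\pi}{\sigma^2_\alpha + \sigma^2_\pi + \sigma^2_{\alpha\pi} + \sigma^2_e}$$

$$\text{Intraclass correlation} = \frac{MS_{\text{Subjects}} - MS_{J \times S}}{MS_{\text{Subjects}} + (j - 1)MS_{J \times S} + j(MS_{\text{Judge}} - MS_{J \times S})/n}$$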

Section 14.10 Intraclass Correlation 497