context of the nature of the variable (a brief state or
temporary syndrome vs. a long-standing personality
trait) as well as the length of the intervening time
period between test and retest. When test–retest reli-
ability is low, this may be due to a host of factors,
including subjects’ tendency to report fewer symp-
toms at retest, subjects’ boredom or fatigue at retest,
or the effect of variations in mood on the report of
symptoms (Sher & Trull, 1996). Table 6-4 describes
reliability indices for structured interviews.
Table 6-5 presents a hypothetical data set from a
study assessing the reliability of alcoholism diagnoses
derived from a structured interview. This example
assesses interrater reliability (the level of agreement
between two raters), but the calculations would be
the same if one wanted to assess test–retest reliability.
In that case, the data for Rater 2 would be replaced
by data for Testing 2 (Retest). As can be seen, the
two raters evaluated the same 100 patients for the
presence/absence of an alcoholism diagnosis, using
a structured interview. These two raters agreed in
90% of the cases [(30 + 60)/100]. Agreement here
refers to coming to the same conclusion—not just
agreeing that the diagnosis is present but also that
the diagnosis is absent. Table 6-5 also presents the
calculation for kappa—a chance-corrected index of
agreement that is typically lower than overall agree-
ment. The reason for this lower value is that raters
will agree on the basis of chance alone in situations
where the prevalence rate for a diagnosis is relatively
high or relatively low. In the example shown in
Table 6-5, we see that the diagnosis of alcoholism
is relatively infrequent.
Therefore, a rater who always judged the disor-
der to be absent would be correct (and likely to agree
with another rater) in many cases. The kappa
coefficient takes into account such instances of agree-
ment based on chance alone and adjusts the agree-
ment index (downward) accordingly. In general, a
kappa value between .75 and 1.00 is considered to
reflect excellent interrater agreement beyond chance
(Cicchetti, 1994).
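
The arithmetic behind this chance correction can be illustrated with a short Python sketch. The function name cohens_kappa and the variable names below are chosen for the example; the cell counts are those shown in Table 6-5.

```python
# A minimal sketch of the chance-corrected agreement index (kappa) for a
# 2 x 2 table like Table 6-5. Cell counts: a = both raters say "present",
# b and c = the two raters disagree, d = both raters say "absent".
def cohens_kappa(a, b, c, d):
    n = a + b + c + d
    observed = (a + d) / n                                    # overall agreement
    chance = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2   # agreement expected by chance
    return (observed - chance) / (1 - chance)

# Table 6-5 data: 30 + 60 agreements and 5 + 5 disagreements among 100 patients
print(cohens_kappa(30, 5, 5, 60))  # about .78, lower than the .90 overall agreement
```

Because the diagnosis is absent in most cases, much of the raw agreement could arise by chance, which is why kappa falls below the overall agreement figure.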

Validity

The validity of any type of psychological measure
can take many forms. Content validity refers to the
measure’s comprehensiveness in assessing the vari-
able of interest. In other words, does it adequately

T A B L E 6-5 Diagnostic Agreement Between Two Raters

                            Rater 2
                      Present      Absent
Rater 1   Present     30 (a)        5 (b)
          Absent       5 (c)       60 (d)

N = 100

Overall Agreement = (a + d)/N = 90/100 = .90

Kappa = [(a + d)/N - ((a + b)(a + c) + (c + d)(b + d))/N²]
        / [1 - ((a + b)(a + c) + (c + d)(b + d))/N²]
      = (ad - bc)/[(ad - bc) + N(b + c)/2]
      = 1775/2275
      = .78

T A B L E 6-4 Common Types of Reliability That Are Assessed to Evaluate Interviews

Type of Reliability        Definition                                               Statistical Index

Interrater or              Index of the degree of agreement between two or more     Pearson’s r
interjudge reliability     raters or judges as to the level of a trait that is      Intraclass correlation
                           present or the presence/absence of a feature or          Kappa
                           diagnosis

Test–retest reliability    Index of the consistency of interview scores across      Pearson’s r
                           some period of time                                      Intraclass correlation
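
As a minimal sketch of the first index listed in Table 6-4, the Python example below correlates a hypothetical set of interview scores at Test and Retest; the scores and the helper pearson_r are invented for illustration and do not come from the text.

```python
# Test-retest reliability as a Pearson correlation between two administrations
# of the same interview to the same subjects (hypothetical data).
from statistics import mean

def pearson_r(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    var_x = sum((xi - mx) ** 2 for xi in x)
    var_y = sum((yi - my) ** 2 for yi in y)
    return cov / (var_x * var_y) ** 0.5

test   = [12, 8, 15, 10, 7, 14]    # symptom scores at Testing 1
retest = [11, 9, 14, 10, 6, 15]    # scores for the same subjects at Testing 2 (Retest)
print(pearson_r(test, retest))     # a high r indicates good test-retest reliability
```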
