Handbook of Psychology, Volume 4: Experimental Psychology


6 : 1 delay ratio is greater than the 2 : 6 magnitude ratio).
However, if the delays to both the small and large reinforcers
are increased by the same amount, then Equation 13.5 predicts a
reversal of preference. For example, if the delays are both
increased by 10 s, then predicted preference for the small
reinforcer is only 33% (the 16 : 11 delay ratio is no longer
enough to compensate for the 2 : 6 magnitude ratio). Empirical
support for such preference reversals has been obtained in
studies of both human and nonhuman choice (Green & Snyderman,
1980; Kirby & Herrnstein, 1995). These data suggest that the
temporal discounting function (that is, the function that
relates the value of a reward to its delay) is not exponential,
as assumed by normative economic theory, but rather hyperbolic
in form (Myerson & Green, 1995).
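
The arithmetic behind the reversal can be checked directly. The sketch below assumes that Equation 13.5, which is not reproduced in this passage, has the form B_S / B_L = (A_S / A_L)(D_L / D_S); this form yields the 33% figure quoted in the text. It also assumes reinforcer magnitudes of 2 and 6 at delays of 1 s and 6 s. The function names, the hyperbolic parameter k = 1, and the exponential rate of 0.3 are illustrative choices, not values from the text.

```python
import math


def small_share(a_small, a_large, d_small, d_large):
    """Predicted choice proportion for the small reinforcer.

    Assumes Equation 13.5 has the form
    B_S / B_L = (A_S / A_L) * (D_L / D_S).
    """
    ratio = (a_small / a_large) * (d_large / d_small)
    return ratio / (1.0 + ratio)


def hyperbolic_value(amount, delay, k=1.0):
    """Hyperbolic discounting: value = amount / (1 + k * delay)."""
    return amount / (1.0 + k * delay)


def exponential_value(amount, delay, rate=0.3):
    """Exponential discounting: value = amount * exp(-rate * delay)."""
    return amount * math.exp(-rate * delay)


def prefers_small(value_fn, added_delay):
    """True if the small, sooner reinforcer has the higher discounted value."""
    small = value_fn(2.0, 1.0 + added_delay)
    large = value_fn(6.0, 6.0 + added_delay)
    return small > large


if __name__ == "__main__":
    print(f"short delays: {small_share(2, 6, 1, 6):.0%} for the small reinforcer")
    print(f"plus 10 s:    {small_share(2, 6, 11, 16):.0%} for the small reinforcer")
    for name, fn in [("hyperbolic", hyperbolic_value),
                     ("exponential", exponential_value)]:
        print(name, prefers_small(fn, 0.0), prefers_small(fn, 10.0))
```

Under these assumptions the hyperbolic function switches from the small to the large reinforcer once 10 s is added to both delays. The exponential function does not, because a common added delay multiplies both values by the same factor and leaves their ratio unchanged.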


Choice Between Stimuli of Acquired Value


Concurrent Chains. A more complex procedure that has been
widely used in research on choice is concurrent chains, which
is a version of concurrent schedules in which responses are
reinforced not by food but by stimuli that are correlated with
different schedules of food reinforcement. In concurrent
chains, subjects respond during a choice phase (initial links)
to obtain access to one of two reinforcement schedules
(terminal links). The stimuli that signal the onset of the
terminal links are analogous to Pavlovian CSs and are often
called conditioned reinforcers, as their potential to reinforce
initial-link responding derives from a history of pairing with
food. Conditioned reinforcement has been a topic of
long-standing interest because it is recognized that many of
the reinforcers that maintain human behavior (e.g., money) are
not of inherent biological significance (see B. A. Williams,
1994b, for review). Preference in the initial links of
concurrent chains is interpreted as a measure of the relative
value of the schedules signaled by the terminal links.

Herrnstein (1964) found that ratios of initial-link response
rates matched the ratios of reinforcement rates in the terminal
links, suggesting that the matching law might be extended to
concurrent chains. However, subsequent studies showed that the
overall duration of the initial and terminal links (the
temporal context of reinforcement) affected preference in ways
not predicted by the matching law. To account for these data,
Fantino (1969) proposed the delay-reduction hypothesis, which
states that the effectiveness of a terminal-link stimulus as a
conditioned reinforcer depends on the reduction in delay to
reinforcement signaled by the terminal link. According to
Fantino’s model, the value of a stimulus depends inversely on
the reinforcement context in which it occurs (i.e., value is
enhanced by a lean context, and vice versa). Fantino (1977)
showed that the delay-reduction hypothesis provided an
excellent qualitative account of preference in concurrent
chains. Moreover, there is considerable evidence for the
generality of the temporal context effects predicted by the
model: The delay-reduction hypothesis has been extended to a
variety of different situations (see Fantino, Preston, & Dunn,
1993, for a review).
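
The passage does not state the delay-reduction formula. The sketch below assumes the commonly cited form, in which the choice proportion for alternative 1 is (T - t1) / ((T - t1) + (T - t2)). Here T is the average time to food from the onset of the initial links, and t1 and t2 are the average terminal-link delays. If one terminal link signals no reduction in delay (t >= T), the other is predicted to be chosen exclusively. The function name, the approximation of T for equal and independent VI initial links, and the schedule values are illustrative assumptions.

```python
def delay_reduction_preference(initial_vi, t1, t2):
    """Predicted choice proportion for alternative 1 in concurrent chains.

    Assumes equal, independent VI initial links with mean `initial_vi`
    seconds, so that a terminal link is entered every initial_vi / 2
    seconds on average and each terminal link is entered equally often.
    """
    total = initial_vi / 2.0 + (t1 + t2) / 2.0  # T: average time to food
    r1 = total - t1  # delay reduction signaled by terminal link 1
    r2 = total - t2  # delay reduction signaled by terminal link 2
    if r1 > 0 and r2 <= 0:
        return 1.0  # only alternative 1 signals a reduction in delay
    if r2 > 0 and r1 <= 0:
        return 0.0  # only alternative 2 signals a reduction in delay
    return r1 / (r1 + r2)


if __name__ == "__main__":
    # Terminal links of 30 s versus 90 s under three initial-link durations.
    for initial_vi in (40.0, 120.0, 600.0):
        p = delay_reduction_preference(initial_vi, 30.0, 90.0)
        print(f"initial links VI {initial_vi:.0f} s: {p:.2f}")
```

With the terminal links held constant, lengthening the initial links moves the predicted preference toward indifference. This is the kind of temporal-context effect that the matching law alone does not predict.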

Preference for Variability, Temporal Discounting, and the
Adjusting-Delay Procedure. Studies with pigeons and rats have
consistently found evidence of preference for variability in
reinforcement delays: Subjects prefer a VI terminal link in
concurrent chains over an FI terminal link that provides the
same average reinforcement rate. This implies that animals are
risk-prone when choosing between different reinforcement delays
(e.g., Killeen, 1968). Interestingly, when given a choice
between a variable and a fixed amount of food, animals are
often risk-averse, although this preference appears to be
modulated by deprivation level as predicted by risk-sensitive
foraging theory from behavioral ecology (see Kacelnik &
Bateson, 1996, for a review). For example, Caraco, Martindale,
and Whittam (1980) found that juncos’ preference for a variable
versus constant number of seeds increased when food deprivation
was greater.

Mazur (1984) introduced an adjusting-delay procedure that has
become widely used to study preference for variability. His
procedure is similar to concurrent chains in that the subject
chooses between two stimuli that are correlated with different
delays to reward, but the dependent variable is an indifference
point, that is, a delay to reinforcement that is equally
preferred to a particular schedule. Mazur determined
fixed-delay indifference points for a series of variable-delay
schedules, and found that the following model (Equation 13.6)
gave an excellent account of his results:

V = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{1 + K d_i}    (13.6)

In Equation 13.6, V is the conditioned value of the stimulus
that signals a delay to reinforcement, d_1, ..., d_n, and K is
a sensitivity parameter. Equation 13.6 is called the
hyperbolic-decay model because it assumes that the value of a
delayed reinforcer decreases according to a hyperbola (see
Figure 13.4). The hyperbolic-decay model has become the leading
behavioral model of temporal discounting, and has been
extensively applied to human choice between delayed rewards
(e.g., Kirby, 1997).
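
A short numerical sketch shows how the hyperbolic-decay model implies a preference for variability. It reads Equation 13.6 as V = (1/n) * sum of 1 / (1 + K * d_i) over n equally likely delays, and solves 1 / (1 + K * d) = V for the fixed delay d that is equally valued. The function names, the value K = 1, and the example delays are illustrative assumptions, not parameters reported in the text.

```python
def schedule_value(delays, k=1.0):
    """Equation 13.6: mean of 1 / (1 + K * d_i) over the possible delays."""
    return sum(1.0 / (1.0 + k * d) for d in delays) / len(delays)


def fixed_delay_indifference_point(delays, k=1.0):
    """Fixed delay d whose value 1 / (1 + K * d) equals the schedule's value."""
    v = schedule_value(delays, k)
    return (1.0 / v - 1.0) / k


if __name__ == "__main__":
    variable_delays = [2.0, 18.0]  # equally likely delays with a mean of 10 s
    d_equiv = fixed_delay_indifference_point(variable_delays)
    print(f"equivalent fixed delay: {d_equiv:.2f} s")  # about 4.18 s
```

With these values the variable schedule, whose mean delay is 10 s, is worth the same as a fixed delay of roughly 4.2 s. A fixed 10 s delay is therefore valued less than the variable schedule, which matches the preference for variable over fixed delays described above.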

General Models for Choice

Recently, several general models for choice have been proposed.
These models may be viewed as extensions of the
