Handbook of Psychology, Volume 4: Experimental Psychology


378 Conditioning and Learning


Such dysfunctional behaviors may provide models of select
instances of human psychopathology.


Instrumental Contingencies and Schedules
of Reinforcement


There are four basic types of instrumental contingencies,
depending on whether the response either produces or elimi-
nates the outcome and whether the outcome is of positive or
negative hedonic value. Positive reinforcement (i.e., reward)
is a contingency in which responding produces an outcome
with the result that there is an increase in response
frequency—for example, when a rat’s lever press results in
food presentation, or a student’s studying before an exam
produces an A grade. Punishment is a contingency in which
responding results in the occurrence of an aversive outcome
with the result that there is a decrease in response fre-
quency—for example, when a child is scolded for reaching
into the cookie jar or a rat’s lever press produces foot shock.
Omission (or negative punishment) describes a situation in
which responding cancels or prevents the occurrence of a
positive outcome with the result that there is a decrease in re-
sponse frequency. Finally, escape or avoidance conditioning
(also called negative reinforcement) is a contingency in
which responding leads to the termination of an ongoing or
prevention of an expected aversive stimulus with the result
that there is an increase in response frequency—for example,
if a rat’s lever presses cancel a scheduled shock. Both posi-
tive and negative reinforcement contingencies by definition
result in increased responding, whereas omission and punishment contingencies by definition lead to decreased responding. For various reasons, including obvious
ethical concerns, it is desirable whenever possible to use al-
ternatives to punishment for behavior modification. For this reason, and for practical considerations, there has been an increasing emphasis in the basic and applied research literature
on positive reinforcement; research on punishment and aver-
sive conditioning is not discussed here (for reviews, see
Ayres, 1998; Dinsmoor, 1998).
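The 2 × 2 classification above (whether the response produces or prevents the outcome, crossed with whether the outcome is appetitive or aversive) can be summarized in a short sketch. The function and its string labels are illustrative choices, not terminology drawn directly from the handbook:

```python
def classify_contingency(response_effect, outcome_valence):
    """Map the two dimensions of an instrumental contingency to its
    name and its definitional effect on response frequency.

    response_effect: "produces" or "prevents" (the outcome)
    outcome_valence: "appetitive" or "aversive"
    (Illustrative helper; the argument names are assumptions.)
    """
    table = {
        ("produces", "appetitive"): ("positive reinforcement", "increase"),
        ("produces", "aversive"): ("punishment", "decrease"),
        ("prevents", "appetitive"): ("omission", "decrease"),
        ("prevents", "aversive"): ("negative reinforcement", "increase"),
    }
    return table[(response_effect, outcome_valence)]
```

For example, a rat's lever press producing food is `classify_contingency("produces", "appetitive")`, which returns `("positive reinforcement", "increase")`.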
A reinforcement schedule is a rule for determining
whether a particular response by a subject will be reinforced
(Ferster & Skinner, 1957). There are two criteria that
have been widely studied: the number of responses emitted
since the last reinforced response (ratio schedules), and the
time since the last reinforced response (interval schedules).
Use of these criteria provides for four basic schedules of reinforcement, which depend on whether the contingency is fixed
or variable: fixed interval (FI), fixed ratio (FR), variable in-
terval (VI), and variable ratio (VR). Under an FI x schedule, the first response after x seconds have elapsed since the last


reinforcement is reinforced. After reinforcement there is typically a pause in responding; responding then begins slowly and increases to a high rate about two-thirds of the way through the interval (Schneider, 1969). The temporal con-
trol evidenced by FI performance has led to extensive use
of these schedules in research on timing (e.g., the peak procedure; Roberts, 1981). With an FR x schedule, the xth
response is reinforced. After a postreinforcement pause, re-
sponding begins and generally continues at a high rate until
reinforcement. When x is large enough, responding may cease entirely with FR schedules (ratio strain; Ferster & Skinner, 1957). Under a VI x schedule, the first response after y seconds have elapsed is reinforced, where y is a value sampled from a distribution that has an average of x seconds.
Typically, VI schedules generate steady, moderate rates of re-
sponding (Catania & Reynolds, 1968). When a VR x schedule is arranged, the yth response is reinforced, where y is a value sampled from a distribution with an arithmetic mean of x. Variable ratio schedules maintain the highest overall
rates of responding of these four common schedule types,
even when rates of reinforcement are equated (e.g., Baum,
1993).
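As a rough sketch, the four schedule rules above can be expressed as predicates over the number of responses and the time elapsed since the last reinforcer. The sampling distributions used for the variable schedules below are arbitrary choices with the stated means, not the distributions used in the cited experiments, and the function name is an assumption for illustration:

```python
import random

def make_schedule(kind, x, seed=None):
    """Return a predicate reinforced(n, t) for one of the four basic
    schedules, where n is the number of responses and t the seconds
    elapsed since the last reinforcer (the caller resets both after
    each reinforcement). Illustrative sketch only.
    """
    rng = random.Random(seed)
    if kind == "FR":
        # Fixed ratio: the xth response is reinforced.
        return lambda n, t: n >= x
    if kind == "FI":
        # Fixed interval: first response after x seconds is reinforced.
        return lambda n, t: t >= x
    if kind == "VR":
        # Variable ratio: required count drawn with arithmetic mean x
        # (here uniformly from 1..2x-1, an arbitrary choice).
        state = {"req": rng.randint(1, 2 * x - 1)}
        def vr(n, t):
            if n >= state["req"]:
                state["req"] = rng.randint(1, 2 * x - 1)
                return True
            return False
        return vr
    if kind == "VI":
        # Variable interval: required wait drawn with mean x seconds
        # (here uniformly from 0..2x, an arbitrary choice).
        state = {"wait": rng.uniform(0, 2 * x)}
        def vi(n, t):
            if t >= state["wait"]:
                state["wait"] = rng.uniform(0, 2 * x)
                return True
            return False
        return vi
    raise ValueError(kind)
```

For instance, under `make_schedule("FI", 30)` a response at t = 10 s goes unreinforced while the first response at or after t = 30 s is reinforced, which is the condition that produces the characteristic pause-then-accelerate FI pattern described above.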
Reinforcement schedules have been a major focus of re-
search in instrumental conditioning (for review, see Zeiler,
1984). Representative questions include why VR schedules
maintain higher response rates than comparable VI schedules
(the answer seems to be that short interresponse times are re-
inforced under VR schedules; Cole, 1999), and whether
schedule effects are best understood in terms of momentary
changes in reinforcement probability or of the overall rela-
tionship between rates of responding and reinforcement (i.e.,
molecular vs. molar level of analysis; Baum, 1973). In addi-
tion, because of the stable, reliable behaviors they produce,
reinforcement schedules have been widely adopted for use in
related disciplines as baseline controls (e.g., behavioral phar-
macology, behavioral neuroscience).

Comparing Pavlovian and Instrumental Conditioning

Many of the phenomena identified in Pavlovian conditioning
have instrumental counterparts. For example, the basic rela-
tions of acquisition as a result of response-outcome pairings
and extinction as a result of nonreinforcement of the re-
sponse, as well as spontaneous recovery from extinction, are
found in instrumental conditioning (see Dickinson, 1980;
R. R. Miller & Balaz, 1981, for more detailed comparisons).
Blocking and overshadowing may be obtained for instrumen-
tal responses (St. Claire-Smith, 1979; B. A. Williams, 1982).
Stimulus generalization and discrimination characterize in-
strumental conditioning (Guttman & Kalish, 1956). Temporal