Learning, Training and Behaviour 105
Basic principles of secondary reinforcement training
● The stimulus, e.g. a sound, is repeatedly paired with a primary reinforcer, e.g. a
food treat: sound first and then the food treat.
● The animal learns to anticipate the food as soon as it hears the sound. Because
the association is strong, the sound alone can then act as a reward, but only if it
continues to be associated with the food treat.
● As soon as the animal performs a correct response, the trainer activates the
secondary reinforcer (e.g. the sound), which ‘marks’ the behaviour, and then the
primary reinforcer is given.
Advantages of this training:
● When a primary reinforcer is used alone, there can be a delay between the
performance of the required behaviour and the delivery of the reward. This can
result in the desired behaviour being missed or another behaviour being
unintentionally rewarded and reinforced.
Box 7.2. Schedules of reinforcement.
Continuous reinforcement: the behaviour is rewarded every time it occurs.
Partial (intermittent) reinforcement: the behaviour is rewarded intermittently.
Ratio schedules
Fixed ratio: the subject is rewarded only after it has responded correctly a set
number of times.
Variable ratio: the number of correct responses required for each reward varies.
Interval schedules
There is a time interval between rewards.
Fixed interval: the subject is rewarded only for the first correct response that
occurs after a set time has passed since the last reward.
Variable interval: the time that must pass between rewards for a correct response
varies.
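The schedules above can be read as simple rules for deciding, at each correct response, whether a reward is given. The following Python sketch is one possible way to express them and is not taken from the text: the function names, the `now` argument (time of the response in seconds), the choice to start the interval timer at zero, and the way the variable schedules draw their next requirement at random around a mean are all illustrative assumptions.

```python
import random


def fixed_ratio(n):
    """Reward every n-th correct response (n = 1 is continuous reinforcement)."""
    count = 0

    def decide(now):
        nonlocal count
        count += 1
        if count >= n:
            count = 0
            return True
        return False

    return decide


def variable_ratio(mean_n, rng):
    """Reward after a number of correct responses that varies around mean_n.

    Assumption: the requirement is drawn uniformly from 1 .. 2*mean_n - 1.
    """
    count = 0
    target = rng.randint(1, 2 * mean_n - 1)

    def decide(now):
        nonlocal count, target
        count += 1
        if count >= target:
            count = 0
            target = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return decide


def fixed_interval(seconds):
    """Reward the first correct response made once `seconds` have passed
    since the last reward (timer assumed to start at 0)."""
    last_reward = 0.0

    def decide(now):
        nonlocal last_reward
        if now - last_reward >= seconds:
            last_reward = now
            return True
        return False

    return decide


def variable_interval(mean_seconds, rng):
    """Like fixed_interval, but the required wait varies around mean_seconds.

    Assumption: the wait is drawn uniformly from 0 .. 2*mean_seconds.
    """
    last_reward = 0.0
    wait = rng.uniform(0, 2 * mean_seconds)

    def decide(now):
        nonlocal last_reward, wait
        if now - last_reward >= wait:
            last_reward = now
            wait = rng.uniform(0, 2 * mean_seconds)
            return True
        return False

    return decide
```

For example, with one correct response per second, `fixed_ratio(3)` rewards the 3rd, 6th and 9th responses, while `fixed_interval(4)` rewards the responses made at 4 and 8 seconds. Real training sessions are of course far less regular than this.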
What is best for training?
Continuous reinforcement is advisable at the start of training to allow an initial strong
association to be made between the behaviour and the reward. One problem with
continuous reinforcement, however, is that when the rewards stop the behaviour will
quickly extinguish, i.e. no longer be performed. Behaviours that have been trained
using a partial schedule of reinforcement, particularly a variable ratio, have been
shown to be much more resistant to extinction. In other words, they persist longer in
the absence of reward.
Therefore, once a strong initial association has been made, the trainer is best
advised to change from rewarding the cat every time it does the right thing to offering
rewards intermittently, with no set pattern.
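As a rough illustration of this advice, the sketch below generates a reward plan that starts with continuous reinforcement and then switches to an unpredictable, variable-ratio pattern. The function name `reward_plan`, its parameters, and the numbers in the usage note are hypothetical choices made for the example; the text does not specify how many responses to reward continuously or what average ratio to use afterwards.

```python
import random


def reward_plan(n_responses, n_continuous, mean_ratio, seed=None):
    """Return one boolean per correct response: True means 'give a reward'.

    The first n_continuous correct responses are all rewarded. After that,
    a reward follows a varying number of correct responses, drawn uniformly
    from 1 .. 2*mean_ratio - 1 (an assumed way of having 'no set pattern').
    """
    rng = random.Random(seed)
    plan = []
    count = 0
    target = rng.randint(1, 2 * mean_ratio - 1)
    for i in range(n_responses):
        if i < n_continuous:
            # Continuous phase: build the initial association.
            plan.append(True)
            continue
        # Intermittent phase: reward only when the current target is reached.
        count += 1
        if count >= target:
            plan.append(True)
            count = 0
            target = rng.randint(1, 2 * mean_ratio - 1)
        else:
            plan.append(False)
    return plan
```

Calling `reward_plan(50, 10, 3)` would, under these assumptions, reward each of the first 10 correct responses and then reward on average about every third correct response, with never more than four unrewarded responses in a row.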