audio professionals choose to monitor at very high
levels? There could be many reasons. Loud levels may
be more exciting. It may simply be a matter of habit.
For instance, an audio engineer normally turns the
volume to his or her customary level fairly accurately.
Moreover, because frequency selectivity is different at
different levels, an audio engineer might choose to
make a recording while listening at a “realistic” or
“performance” level rather than monitoring at a level
that is demonstrably more accurate. Finally, of course,
there are some audio professionals who have lost some
hearing already, and in order to pick up certain
frequency bands they keep on boosting the level, which
unfortunately further damages their hearing.
3.3.2 Masking and Its Application in Audio Encoding
Suppose a listener can barely hear a given acoustical signal
under quiet conditions. When the signal is played in the
presence of another sound (called a “masker”), the signal
usually has to be stronger so that the listener can hear it.^17
The masker does not have to include the frequency components
of the original signal for the masking effect to
take place, and a masked signal can already be heard
when it is still weaker than the masker.^18
Masking can happen when a signal and a masker are
played simultaneously (simultaneous masking), but it
can also happen when a masker starts and ends before a
signal is played. This is known as forward masking.
Although it is hard to believe, masking can also happen
when a masker starts after a signal stops playing! In
general, the effect of this backward masking is much
weaker than forward masking. Forward masking can
happen even when the signal starts more than 100 ms
after the masker stops,^19 but backward masking disappears
when the masker starts 20 ms after the signal.^20
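The three timing relationships described above can be sketched in a few lines of Python. This is only an illustration of the definitions, not a perceptual model: the function name `classify_masking` and the (start, end) tuple layout are hypothetical, and the sketch does not decide whether masking is actually audible. The returned gap can be compared with the rough figures quoted in the text (more than 100 ms for forward masking, about 20 ms for backward masking).

```python
def classify_masking(signal, masker):
    """Classify masking by timing alone.

    Each argument is a hypothetical (start_ms, end_ms) tuple.
    Returns (kind, gap_ms), where gap_ms is the silent interval
    between the two sounds (0 when they overlap).
    """
    sig_start, sig_end = signal
    msk_start, msk_end = masker
    if msk_end <= sig_start:
        # Masker has ended before the signal begins.
        return "forward", sig_start - msk_end
    if msk_start >= sig_end:
        # Masker begins only after the signal has ended.
        return "backward", msk_start - sig_end
    # Any overlap in time counts as simultaneous masking.
    return "simultaneous", 0


print(classify_masking((150, 200), (0, 100)))  # ('forward', 50)
print(classify_masking((0, 50), (60, 160)))    # ('backward', 10)
print(classify_masking((0, 100), (50, 150)))   # ('simultaneous', 0)
```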
The masking effect has been widely used in psychoacoustical
research. For example, Fig. 3-10 shows the tuning curve for a
chinchilla. For safety reasons, such experiments may not be
performed on human subjects. With masking, however, one can
vary the level of a masker, measure the threshold (i.e., the
minimum sound level that the listener can hear), and create a
psychophysical tuning curve that reveals similar features.
Besides scientific research, masking effects are also
widely used in areas such as audio encoding. With the
distribution of digital recordings, it is desirable to
reduce the sizes of audio files. A lossless encoder is an
algorithm that encodes an audio file into a smaller file
that can be completely reconstructed by another algorithm
(the decoder). However, the files produced by lossless
encoders are still relatively large. To reduce the size
further, some less important information has to be
eliminated. For example, one might eliminate high
frequencies, which is not too bad for speech communication.
For music, however, some important quality might be lost.
Fortunately, because of the masking effect, one can eliminate
some weak sounds that are masked, so that listeners hardly
notice the difference. This technique has been widely used in
audio encoders, such as MP3.
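The idea of discarding masked components can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the function names, the single-masker setup, and the threshold model (a fixed offset below the masker level that falls off linearly per octave of distance) are hypothetical and far simpler than the psychoacoustic models used in real encoders such as MP3.

```python
import math


def masked_threshold_db(freq_hz, masker_freq_hz, masker_level_db,
                        offset_db=10.0, slope_db_per_octave=25.0):
    """Hypothetical toy threshold: a fixed offset below the masker level,
    dropping linearly with distance from the masker in octaves."""
    octaves = abs(math.log2(freq_hz / masker_freq_hz))
    return masker_level_db - offset_db - slope_db_per_octave * octaves


def prune_masked(components, masker, quiet_threshold_db=0.0):
    """Keep only components at or above the (toy) masked threshold.

    components: list of (frequency_hz, level_db) tuples.
    masker:     one (frequency_hz, level_db) tuple.
    """
    masker_freq, masker_level = masker
    kept = []
    for freq, level in components:
        # The threshold never falls below the assumed threshold in quiet.
        threshold = max(
            quiet_threshold_db,
            masked_threshold_db(freq, masker_freq, masker_level),
        )
        if level >= threshold:
            kept.append((freq, level))
    return kept


masker = (1000.0, 80.0)
components = [(1000.0, 80.0), (1100.0, 50.0), (2000.0, 50.0), (4000.0, 15.0)]
print(prune_masked(components, masker))
# [(1000.0, 80.0), (2000.0, 50.0)]
```

In this sketch the 50 dB component at 1100 Hz is dropped because it lies close to the masker, while the equally strong component at 2000 Hz survives because the toy threshold has fallen to 45 dB one octave away.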
Figure 3-9. Tuning curve with (solid) and without (dashed)
functioning outer hair cells. (Liberman and Dodds,
Reference 14.)
[Figure: threshold (dB SPL, 0 to 100) versus frequency (kHz, ticks at 0.1, 1.0, and 4.0).]
Figure 3-10. Tuning curve at various levels at a particular
location of the basilar membrane of a chinchilla. (Plack,
Reference 15, p. 90; Ruggero et al., Reference 16.)
[Figure: basilar membrane velocity (dB, 0 to 90) versus frequency (axis labeled Hz, ticks 2 to 16), with curves labeled 20 to 90 dB SPL in 10 dB steps.]