Handbook for Sound Engineers


58 Chapter 3


a pure tone, ITD and IPD are linearly related. The ITD
and IPD are also referred to collectively as the interaural
temporal difference. In summary, left-right localization
relies on two cues: the ILD and the ITD.


Adjusting either the ILD or the ITD cue can shift the
perceived left-right position of a sound. In reality, both
cues vary together, and each has its limits. ILD cues are
most useful when the interaural level difference is
large. Because of diffraction
around the head, at frequencies below 1 kHz, the levels
at both ears are similar, leading to small ILD cues.
Therefore, ILD cues are used mainly at high frequencies,
where the head shadow substantially blocks sound from
reaching the contralateral ear (the one facing away from the
source). On the other hand, the ITD cues have a limit as well.
At frequencies above 700 Hz, the IPD of a source at
extreme left or right would exceed 180°. For a pure
tone, this would lead to confusion: a tone far to the right
might sound as if it came from the left (Fig. 3-20). Since we care most
about the sound sources in front of us, this limit of
700 Hz can be extended upward a little. Furthermore,
with a complex, broadband signal, we
can also use the time delay (or phase difference) of the
low-frequency modulation. In general, a frequency of
1.2 kHz (or a frequency range between 1 and 1.5 kHz)
is a good estimate for a boundary, below which ITD
cues are important, and above which ILD cues are
dominant.
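
The phase ambiguity described above can be illustrated with a short sketch. It relies only on the fact that, for a pure tone, the IPD in degrees is 360 times the frequency times the ITD. The maximum ITD of 0.7 ms is an assumption chosen to be consistent with the 700 Hz limit quoted in the text, and the function names are hypothetical.

```python
MAX_ITD_S = 0.0007  # assumed maximum ITD (seconds) for a source at the extreme side


def ipd_degrees(freq_hz, itd_s):
    """Unwrapped interaural phase difference of a pure tone, in degrees."""
    return 360.0 * freq_hz * itd_s


def wrap_phase(deg):
    """Wrap a phase difference into the interval (-180, 180] degrees."""
    wrapped = (deg + 180.0) % 360.0 - 180.0
    return 180.0 if wrapped == -180.0 else wrapped


def ambiguity_frequency(itd_s):
    """Frequency at which the IPD for the given ITD reaches 180 degrees."""
    return 1.0 / (2.0 * itd_s)


# With the assumed 0.7 ms maximum ITD, the IPD reaches 180 degrees near 714 Hz.
limit_hz = ambiguity_frequency(MAX_ITD_S)

# A lag of 270 degrees is indistinguishable from a lead of 90 degrees (Fig. 3-20).
perceived = wrap_phase(270.0)
```

Here a positive phase means the left ear lags the right ear, so the wrapped value of -90 corresponds to the left ear appearing to lead by 90°.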


In recording, adjusting the ILD cues is easily
achieved by panning between the left and right channels.
Although adjusting ITD cues also moves the sound
image when listening through headphones, through
loudspeakers the ILD cues are more reliable than ITD
cues with respect to the loudspeaker positions.
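
As a rough illustration of panning by level difference, the sketch below uses a constant-power (sine/cosine) pan law. The pan law is an assumption, since the text does not name one, and the function names are hypothetical. Note that the value computed is the level difference between the two channels; over loudspeakers, the ILD at the ears also depends on crosstalk and the room.

```python
import math


def constant_power_pan(pan):
    """Return (gain_left, gain_right) for pan in [-1, 1].

    -1 is hard left, 0 is center, +1 is hard right. The squared gains
    always sum to 1, so the total power stays constant.
    """
    angle = (pan + 1.0) * math.pi / 4.0  # maps [-1, 1] to [0, pi/2]
    return math.cos(angle), math.sin(angle)


def level_difference_db(gain_left, gain_right):
    """Right-minus-left inter-channel level difference in dB."""
    if gain_left <= 0.0:
        return math.inf
    if gain_right <= 0.0:
        return -math.inf
    return 20.0 * math.log10(gain_right / gain_left)


# Panning halfway to the right favors the right channel by several dB.
gain_l, gain_r = constant_power_pan(0.5)
difference = level_difference_db(gain_l, gain_r)
```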

3.11.2 Localization on Sagittal Planes

Consider two sound sources, one directly in front of and
one directly behind the head. Due to symmetry, the ILD
and ITD are both zero for those sources. Thus, it would
seem that, by using only ILD and ITD cues, a listener
would not be able to discriminate front and back
sources. If we consider the head to be a sphere with two
holes at the ear-positions (the spherical head model), the
sources producing a given ITD all lie on the surface
of a cone, as shown in Fig. 3-21. This cone is called the
cone of confusion.^46 If only ITD cues are available to a
listener with a spherical head, he or she would not be
able to discriminate sound sources on a cone of confusion.
Of course, the shape of a real head with pinnae is
different from the spherical head, which changes the
shape of the cone of confusion, but the general conclusion
still holds. When the ILD cues are also available,
due to diffraction of the head (i.e., the head shadow), the
listener can further narrow the confusion to a certain
cross-section of the cone (i.e., the dark “donut” in
Fig. 3-21). That is the best one can do with ILD and
ITD cues. However, in reality, most people can easily
localize sound sources in front, in the back, above
the head, etc., even with eyes closed. We can localize
sources in a sagittal plane (a vertical plane separating
the body into left and right parts, not necessarily equal)
thanks to the asymmetrical shapes of our pinnae,
head, and torso. The pinnae are
asymmetrical when looked at from any direction. The
primary role of the pinna is to filter incoming sound, creating
spectral cues that are virtually unique for every angle of
incidence. Different locations on the cone of confusion will
be filtered differently, producing spectral cues unique to
each location.
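
The geometry behind the cone of confusion can be checked numerically. The sketch below is a simplification of the spherical head model: the ears are treated as two points on the interaural axis and diffraction around the head is ignored. The speed of sound, the ear spacing, and the function name are assumptions for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed
EAR_OFFSET = 0.0875     # m, assumed half-distance between the two ears


def itd_two_point(cone_angle_deg, rotation_deg, distance_m=2.0):
    """ITD in seconds for a source on a cone around the interaural axis.

    cone_angle_deg is the angle between the source direction and the
    interaural (x) axis; rotation_deg moves the source around that axis
    (0 = front, 90 = above, 180 = back, 270 = below).
    """
    a = math.radians(cone_angle_deg)
    r = math.radians(rotation_deg)
    source = (
        distance_m * math.cos(a),
        distance_m * math.sin(a) * math.cos(r),
        distance_m * math.sin(a) * math.sin(r),
    )
    d_right = math.dist(source, (EAR_OFFSET, 0.0, 0.0))
    d_left = math.dist(source, (-EAR_OFFSET, 0.0, 0.0))
    return (d_left - d_right) / SPEED_OF_SOUND


# Front, above, back, and below on the same cone give the same ITD.
itds = [itd_two_point(60.0, rot) for rot in (0.0, 90.0, 180.0, 270.0)]
```

Because the ITD depends only on the cone angle in this model, it cannot separate front from back or up from down, which is why the spectral cues from the pinnae are needed.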
The common way of describing the spectral cue for
localization is the head-related transfer function
(HRTF), an example of which is shown in Fig. 3-22. It
is the transfer function (gain versus frequency) that
describes the filtering by the outer ear for each
location in space (or, more often, for each angle of
incidence). Nowadays, with probe microphones inserted
close to the eardrum, HRTFs can be measured with high
accuracy. Once the HRTF is obtained for a given listener,
a recording made in an anechoic chamber and
convolved with the proper HRTF can “cheat” the
auditory system and make the listener believe that the
recording is being played from the location
corresponding to the HRTF.
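
The convolution step described above can be sketched as follows. In the time domain, the HRTF for each ear corresponds to an impulse response, and the anechoic recording is convolved with one impulse response per ear. The impulse responses below are made-up toy values, not measured data, and the function names are hypothetical.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(mono, ir_left, ir_right):
    """Return (left, right) ear signals for a mono anechoic recording."""
    return convolve(mono, ir_left), convolve(mono, ir_right)


# Toy impulse responses (assumed, not measured): the right ear receives the
# sound first and at full level; the left ear receives it three samples
# later and at half the amplitude, mimicking a source on the right.
ir_right = [1.0, 0.0, 0.0, 0.0]
ir_left = [0.0, 0.0, 0.0, 0.5]

click = [1.0, 0.0, 0.0]  # a single-sample click as the anechoic signal
left, right = render_binaural(click, ir_left, ir_right)
```

A measured HRTF pair would replace the toy values, and for long recordings an FFT-based convolution would be faster than this direct form.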

Figure 3-20. Confusion of interaural phase difference (IPD)
cues at high frequencies. The dashed curve for the left ear
lags the solid curve for the right ear by 270°, but this is
confused with the left ear leading the right ear by 90°.
