Handbook for Sound Engineers

Psychoacoustics 59

There are two challenges when using spectral cues.
The first is discriminating between the filtering feature
and the spectrum of the source. For instance, if one
hears a notch around 9 kHz, it might be due to an HRTF,
or the original source spectrum might have a notch
around 9 kHz. Unfortunately there is no simple way to
discriminate between them. However, for a familiar
sound (voice, instruments, etc.) whose spectrum is known
to the auditory system, it is easier to infer the
HRTF filtering, and thus easier to localize the source, than
for an unfamiliar sound. If one has trouble discriminating
between positions on the cone of confusion, one can use the
cues of head motion. For example, suppose a listener
turns his or her head to the left. If the source appears to move to
the right relative to the head, the source is in front; whereas if it
appears to move farther to the left, it must be in the back. The
second challenge is the individuality of HRTFs. No two
people share the same pinna and head shape, and we
have learned our own pinnae and head size/shape over years of experience. If one listens to sounds convolved with the HRTFs of someone else, left–right localization will still be good, but there will be a lot of front–back confusion,^47 unless the listener’s head and ears happen to be similar in size and shape to those of the person whose HRTFs were measured.^48 The human binaural system is remarkably adaptive. Experiments with ear molds^49 show that if a subject listens exclusively through another set of ears, there is a lot of front–back confusion at first, but in about 3 weeks the subject learns the new ears and localizes almost as well as with the original ears. Rather than forgetting either the new or the old ears, the subject memorizes both sets and becomes, in a sense, bilingual, able to switch between the two sets of ears.
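
The head-motion cue described above lends itself to a small numerical illustration. What follows is a minimal Python sketch, not a model of the auditory system. It assumes a source in the horizontal plane, azimuths in degrees measured clockwise from straight ahead (positive to the right), and a head turn small enough that the source does not cross the interaural axis. The function names `lateral_angle_deg` and `infer_front_back` are hypothetical, introduced only for this example.

```python
import math


def lateral_angle_deg(source_az_deg, head_yaw_deg):
    """Lateral angle of the source relative to the head, in degrees.

    Positive means toward the right ear. A front source and its mirror
    image behind the head share the same lateral angle, which is why
    they lie on the same cone of confusion.
    """
    relative = math.radians(source_az_deg - head_yaw_deg)
    return math.degrees(math.asin(math.sin(relative)))


def infer_front_back(lateral_before, lateral_after, head_turn_deg):
    """Guess front or back from how the lateral angle shifts with a head turn.

    head_turn_deg is negative for a turn to the left. A shift opposite
    to the turn indicates a front source; a shift in the same direction
    indicates a rear source.
    """
    shift = lateral_after - lateral_before
    if shift * head_turn_deg < 0:
        return "front"
    if shift * head_turn_deg > 0:
        return "back"
    return "ambiguous"


turn = -20.0  # assumed head turn: 20 degrees to the left
for az in (30.0, 150.0):  # a front source and its rear mirror image
    before = lateral_angle_deg(az, 0.0)
    after = lateral_angle_deg(az, turn)
    print(az, round(before, 1), round(after, 1),
          infer_front_back(before, after, turn))
```

In this sketch the sources at 30° and 150° start with the same lateral angle, so they are confusable while the head is still. After the turn to the left, the front source shifts to the right and the rear source shifts to the left, which is the cue the text describes.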

3.11.3 Externalization

Many listeners prefer listening to music through loudspeakers instead of through headphones. One of the reasons is that when listening through headphones, the pinnae are effectively bypassed, and the auditory system does not receive any of the cues that the pinnae produce. Over headphones, the instruments and singers’ voices are all perceived or localized inside the head. When listening through loudspeakers, although the localization cues are not perfect, the sounds are externalized, if not localized, somewhat more naturally. If, however, music played through headphones is filtered with the HRTFs of the listener, he or she should be able to externalize the sound perfectly.^50 Algorithms are available to simulate 3D sound sources at any location in free field and in a regular room with reverberation. The simulation is accurate up to 16 kHz, and listeners cannot discriminate between the real source and the virtual (simulated) sound.^51,52 An inconvenience, nevertheless, is that the system has to be calibrated to each listener and each room.

In 1985, Jones et al.^53 devised a test for stereo imagery utilizing a reverberator developed at the Northwestern University Computer Music Studio. The reverberator utilized HRTFs to create very compelling simulations of 3D space and of sound sources moving within it. The test, called LEDR (Listening Environment Diagnostic Recording) NU™, contained sound examples that moved along very specific paths. When played over loudspeaker systems free from phase or temporal distortions, in environments free from early reflections, the paths are perceived as intended. In the presence of early reflections or misaligned crossovers or drivers, the paths are audibly corrupted.
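
Filtering a headphone signal with HRTFs, as discussed in this section, corresponds in the time domain to convolving the signal with a pair of head-related impulse responses (HRIRs), one per ear. The Python sketch below shows only that operation. The impulse responses in it are invented placeholders (a pure delay and an attenuation), not measured data, and the function names `convolve` and `render_binaural` are hypothetical. It is not the algorithm of the cited studies, which rely on individually measured responses and per-listener calibration.

```python
def convolve(signal, kernel):
    """Direct-form linear convolution of two lists of floats."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Return (left, right) headphone signals for one virtual source."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy HRIR pair (assumption, NOT measured data) for a source on the right:
# the right ear receives the sound first and at full level, the left ear
# receives a copy delayed by three samples and attenuated by half.
hrir_right = [1.0, 0.0, 0.0, 0.0]
hrir_left = [0.0, 0.0, 0.0, 0.5]

mono = [1.0, -1.0, 0.5]  # a short test signal
left, right = render_binaural(mono, hrir_left, hrir_right)
print("left: ", left)
print("right:", right)
```

With measured HRIRs in place of the toy pair, the same two convolutions would impose the direction-dependent spectral filtering that the pinnae normally provide. For responses of realistic length, an FFT-based convolution would be the more practical choice than the direct loop used here.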

Figure 3-21. Cone of confusion for a spherical head with
two holes at the ear positions. If only ITD cues are available,
the listener cannot discriminate positions on the surface of
the cone of confusion, corresponding to a given ITD. If ILD
cues are also available, due to the diffraction of the head,
the listener can further limit the confusion range into a
circle (the dark “donut” on the figure).
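
The statement in the caption of Figure 3-21, that ITD alone cannot separate positions on the cone, can be checked with a short calculation. The Python sketch below deliberately uses a simpler model than the figure: two point receivers in free field with a distant source, so that the ITD is the projection of the source direction onto the interaural axis divided by the speed of sound. The ear spacing of 0.18 m, the speed of sound of 343 m/s, and the function name `itd_seconds` are assumptions made for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed
EAR_SPACING = 0.18      # m, assumed distance between the two receivers


def itd_seconds(az_deg, el_deg=0.0):
    """Far-field ITD for two free-field point receivers.

    Azimuth is measured from straight ahead, positive to the right;
    elevation is measured upward from the horizontal plane. A positive
    result means the sound reaches the right receiver first.
    """
    az = math.radians(az_deg)
    el = math.radians(el_deg)
    # Component of the unit source direction along the interaural axis.
    lateral = math.sin(az) * math.cos(el)
    return EAR_SPACING / SPEED_OF_SOUND * lateral


# Three directions chosen so that sin(az) * cos(el) = 0.5 in each case:
# front right, rear right, and directly to the side but elevated.
for az, el in ((30.0, 0.0), (150.0, 0.0), (90.0, 60.0)):
    print(az, el, round(itd_seconds(az, el) * 1e6, 1), "microseconds")
```

All three directions give the same ITD under this model, so a listener relying on ITD alone could not tell them apart. A model that includes diffraction around a spherical head would change the ITD values but not this front–back symmetry.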

Figure 3-22. Head-related transfer functions. Each curve
shows the filtering feature (i.e., the gain added by the external
ear at each frequency) of an incident angle. This figure
shows the orientations in the horizontal plane. The angles
are referenced to the medial sagittal plane, ipsilateral to the
ear. The angle of 0° is straight ahead of the subject.

Axes: Gain–dB (−5 to 25) versus Frequency–kHz (0.2 to 12); curves labeled 0°, +45°, +90°, and +135°.
