the frequency range 2–5 kHz. Whether or not you hear a sound therefore depends on the
frequency of the sound and the amplitude of the sound relative to the threshold level for
that frequency.
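As an illustration of this frequency dependence, the sketch below evaluates the approximate "threshold in quiet" using the Terhardt fit that is widely quoted in descriptions of MPEG psychoacoustic models; the formula is an assumption here, since the text itself gives no equation.

```python
import math

def threshold_in_quiet_db(freq_hz):
    """Approximate absolute threshold of hearing in dB SPL (Terhardt's fit)."""
    f = freq_hz / 1000.0   # frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

# A sound is audible only if its level exceeds the threshold at its frequency.
for f in (100, 1000, 3500, 10000):
    print(f"{f:>6} Hz: threshold ≈ {threshold_in_quiet_db(f):6.1f} dB SPL")
```

The curve dips to its lowest values between roughly 3 and 4 kHz, consistent with the sensitivity region mentioned above.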


The threshold of hearing adapts to the sounds that are heard, so that the threshold increases greatly when, for example, loud noises accompany soft music. The louder sound masks the softer one, and the term masking is used for this effect. Note that this is in direct contradiction to the “cocktail-party effect,” which postulates the ability of the ear to focus on a wanted sound in the presence of a louder unwanted sound.
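The raising of the threshold by a loud sound can be put into rough numbers. The sketch below assumes Zwicker's Bark-scale approximation and Schroeder's spreading function with a fixed -10 dB offset, all common textbook simplifications rather than values from this chapter, and shows how an 80 dB tone at 1 kHz lifts the audibility threshold for nearby frequencies.

```python
import math

def bark(f_hz):
    """Zwicker's approximation: frequency in Hz -> critical-band rate (Bark)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)

def masked_threshold_db(masker_hz, masker_level_db, probe_hz):
    """Threshold raised in the neighborhood of a single loud tonal masker.

    Uses Schroeder's spreading function plus a fixed -10 dB offset; both
    are simplifications assumed for illustration only.
    """
    dz = bark(probe_hz) - bark(masker_hz)
    spread = 15.81 + 7.5 * (dz + 0.474) - 17.5 * math.sqrt(1.0 + (dz + 0.474) ** 2)
    return masker_level_db + spread - 10.0

# An 80 dB tone at 1 kHz hides nearby sounds that would otherwise be clearly heard:
for f in (800, 1000, 1500, 4000):
    print(f"{f:>5} Hz probe: inaudible below ≈ {masked_threshold_db(1000, 80, f):5.1f} dB")
```

The masking is strongest close to the masker frequency and falls away rapidly, so a loud tone hides soft sounds near it while leaving distant parts of the spectrum almost unaffected.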


The masking effect is particularly important in orchestral music recording. When a full orchestra plays fortissimo, the instruments that contribute least to the sound are, according to many sources, simply not heard. A CD recording will contain all of this information, even though a large part of it is redundant because it cannot be perceived. By recording only what can be perceived, the amount of music that can be stored on a medium such as a CD is greatly increased, with no perceptible loss of audio quality.
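A rough figure for this gain can be worked out, assuming a typical MPEG-1 Layer II bit rate of 192 kbit/s (the compressed bit rate is an assumption, not a figure given in the text):

```python
# Illustrative arithmetic: CD PCM rate versus a typical compressed rate.
cd_rate = 44_100 * 16 * 2    # CD: 44.1 kHz sampling, 16 bits, stereo -> 1,411,200 bit/s
mpeg_rate = 192_000          # an assumed, commonly used MPEG-1 Layer II bit rate
print(f"CD: {cd_rate/1e3:.1f} kbit/s, compressed: {mpeg_rate/1e3:.0f} kbit/s, "
      f"ratio ≈ {cd_rate/mpeg_rate:.1f}:1")
```

On these figures the reduction is a little over 7:1, which gives some idea of how much of the CD data stream is, on this argument, imperceptible.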


Musicians will feel uneasy about this argument because they, and many others, believe that
every instrument makes a contribution. Can you imagine what an orchestra would sound
like if the softer instruments were not played in any fortissimo passage? Would it still be
fortissimo? Would we end up with a brass band, without strings or woodwinds? My own
view is that the masking theory is not applicable to live music, but it may well apply to
sound that we hear through the restricted channels of loudspeakers. In addition, how will
a compressed recording sound when compared to a version using HDCD technology?


MPEG coding starts with circuitry described as a perceptual subband audio encoder. The action of this section is to analyze the input audio signal continually and, from this information, prepare data (the masking curve) that define the threshold level below which nothing will be heard. The input is then divided into frequency bands, called subbands. Each subband is quantized separately, with the quantization controlled so that the quantization noise stays below the masking curve level for that subband. Data on the quantization used for a subband are held along with the coded audio for that subband so that the decoder can reverse the process. Figure 21.2 shows the block diagram for the encoding process.
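The following is a minimal sketch of that idea, not the standard's algorithm: MPEG-1 Layers I and II use a 32-band polyphase filterbank and a full psychoacoustic model, whereas here the band split is a crude FFT grouping, the masking curve is simply supplied as a per-band threshold, and the function names and the flat -40 dB default are purely illustrative.

```python
import numpy as np

def encode_block(x, n_bands=32, mask_db=None):
    """Toy perceptual subband coder for one block of samples.

    A stand-in for the scheme described above: split into subbands,
    then quantize each subband coarsely enough to save bits but finely
    enough that the quantization noise sits below its masking threshold.
    """
    spectrum = np.fft.rfft(x)
    bands = np.array_split(spectrum, n_bands)       # crude subband split
    if mask_db is None:
        mask_db = np.full(n_bands, -40.0)           # flat threshold, illustration only

    coded = []
    for band, thr_db in zip(bands, mask_db):
        peak = np.max(np.abs(band)) + 1e-12         # scale factor for this subband
        # Step size chosen so quantization noise stays below the masking
        # threshold, expressed relative to the band's peak level.
        step = peak * 10.0 ** (thr_db / 20.0)
        q = np.round(band / step)                   # quantized coefficients
        coded.append((step, q))                     # decoder needs both to reverse it
    return coded

def decode_block(coded, n_samples):
    spectrum = np.concatenate([step * q for step, q in coded])
    return np.fft.irfft(spectrum, n=n_samples)

# Round trip on a short test block:
x = np.sin(2 * np.pi * 1000 * np.arange(1152) / 44100)
y = decode_block(encode_block(x), len(x))
print("max reconstruction error:", np.max(np.abs(x - y)))
```

Storing the step size alongside the quantized values mirrors the point made above: the decoder needs the quantization data for each subband in order to reverse the process.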


21.4.1 Layers


MPEG1, as applied to audio signals, can be used in three modes, called layers I, II, and
III. An ascending layer number means more compression and more complex encoding.
