decoder or stored for later retrieval. The decoder takes the
bit stream and generates the corresponding decoded PCM
signal, which is a rendering of the original input PCM
signal.

Figure 1: Relationship between bandwidth and bit rate.
(Frequency range in Hz versus bit rate in kb/s: telephone,
200 to 3400 Hz, 64 kb/s; wideband, 50 to 7000 Hz, 256 kb/s;
FM quality (mono), 20 Hz to 15 kHz, 512 kb/s; compact disc
(mono), 10 Hz to 20 kHz, 705.6 kb/s.)
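As a quick check on the bit rates labeled in Figure 1, the rate of an uncompressed PCM signal is the sampling rate multiplied by the number of bits per sample. The short Python sketch below reproduces the four values; the sampling rates and word lengths are assumptions (commonly used values that happen to give these rates) and are not stated in the figure itself, and the function name is purely illustrative.

```python
# Raw PCM bit rate = sampling rate x bits per sample x channels.
# ASSUMPTION: the (sampling rate, bits per sample) pairs below are
# common values chosen to match the rates labeled in Figure 1; the
# figure itself only gives the resulting bit rates.
ASSUMED_FORMATS = {
    "telephone": (8_000, 8),
    "wideband": (16_000, 16),
    "FM quality (mono)": (32_000, 16),
    "compact disc (mono)": (44_100, 16),
}


def pcm_bit_rate_kbps(sample_rate_hz, bits_per_sample, channels=1):
    """Return the uncompressed PCM bit rate in kb/s."""
    return sample_rate_hz * bits_per_sample * channels / 1000


for name, (rate_hz, bits) in ASSUMED_FORMATS.items():
    print(f"{name}: {pcm_bit_rate_kbps(rate_hz, bits):g} kb/s")
```

These uncompressed rates are the starting point against which the compressed bit rates discussed in the rest of the chapter can be compared.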
In the remainder of this chapter we focus on reducing
the bit rates of speech and audio signals while providing
the best possible signal quality. Quality is a difficult
attribute to quantify because it has many dimensions.
Most of these are associated with specific applications,
and it is important to set proper objectives when designing
or choosing a particular compression algorithm. For example,
for speech communications it could be important to have
consistent performance for various input conditions, such
as clean speech, noisy speech, and input level variations.
For compression of music signals it could be important to
have consistent performance for various types of music, or
to preserve audio bandwidth and stereo image as much as
possible. As will become clear later, this quality objective
is constrained by other factors, such as the delay (the time
needed to encode and decode a signal) and the complexity
(the number of arithmetic operations) of the methods used.
Speech and audio compression applications can be
divided into two classes. The first is broadcasting (e.g.,
streaming); the other is communication (e.g., Internet
telephony). Each application has different requirements,
as can be seen from Table 1.

Table 1: Difference in Requirements for Broadcasting and Communications
Application      Broadcasting           Communications
Characteristic   One-way transmission   Two-way transmission
Delay            Not important          Important
Complexity       Not important          Important
Technology       Audio coding           Speech coding

Figure 2: Block diagram of a generic coding or compression
operation. (PCM input -> encoder -> bitstream -> transmission
or storage -> bitstream -> decoder -> PCM output)

There are two principal approaches to the compression
of digital signals: lossless and lossy compression. Lossless
compression techniques take advantage of redundancies
in the numerical representation. For example, instead
of using 16 bits per sample uniformly, one could use a
new mapping that assigns symbols of shorter length to
the most frequent values. Or, if the signal values change
slowly between sample values, one could encode the
differences instead, using fewer bits. Lossless compression
is a reversible operation, and the input and output signal
samples of Figure 2 will be identical. Lossy compression
techniques, on the other hand, assume that the signal has
a human destination, which means that signal distortions
can be introduced as long as the listener either is not able
to hear them or has no serious objections. Lossy compression
is not reversible, and the input and output signal
samples of Figure 2 will be different. For many signals
this difference would be unacceptable, but for audio signals
only audible differences matter. If the differences are
inaudible, the lossy coding techniques used are often
referred to as perceptually lossless coding techniques.
Even clearly audible differences are still acceptable
for many applications. A good example is the difference
between telephone speech and natural speech, where the
telephone signal is significantly limited in bandwidth
(typically less than 4 kHz).
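To make the distinction concrete, the following Python sketch contrasts the two approaches on a handful of samples. It is only an illustration under stated assumptions: the sample values, the function names, and the step size are hypothetical, and plain uniform quantization stands in for lossy coding here; it is not the method of any particular codec described in this chapter.

```python
def delta_encode(samples):
    """Keep the first sample, then store sample-to-sample differences."""
    if not samples:
        return []
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]


def delta_decode(deltas):
    """Invert delta_encode by accumulating a running sum."""
    out, acc = [], 0
    for d in deltas:
        acc += d
        out.append(acc)
    return out


def quantize(samples, step):
    """Lossy stand-in: round each sample to the nearest multiple of step."""
    return [step * round(s / step) for s in samples]


# Hypothetical, slowly varying PCM samples (not taken from the chapter).
pcm = [1000, 1004, 1009, 1011, 1010, 1006]

# Lossless: the differences are small numbers that need fewer bits,
# and decoding reproduces the input exactly.
deltas = delta_encode(pcm)          # [1000, 4, 5, 2, -1, -4]
assert delta_decode(deltas) == pcm

# Lossy: a coarser representation that cannot be inverted exactly.
coarse = quantize(pcm, step=8)
assert coarse != pcm
print("largest error:", max(abs(a - b) for a, b in zip(pcm, coarse)))
```

The lossless path recovers every sample, whereas the quantized signal differs from the input by up to half the step size; whether such a difference is acceptable depends on whether the listener can hear it.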
In practice, the use of lossless coding for audio and
speech signals results in limited compression efficiency,
and its use is restricted to high-quality applications such
as the audio DVD. The compression efficiency of lossy
coding can be significantly higher, and consequently
perceptually lossless and lossy coding are the main
approaches used in most audio and speech compression
applications. Speech compression takes this approach one
step further by also assuming that humans generate the
source signal, which gives it certain properties that
compression algorithms can exploit. As a result, speech
compression can achieve very high compression
efficiency. Figure 3 gives a summary of the typical
bit rates and applications.

Figure 3: Typical bit rates and applications for speech and
audio compression. (Horizontal axis: bit rate, from 1 kb/s to
4 Mb/s. Vertical axis: quality, with levels labeled telephone
quality, wideband quality, FM quality, and CD quality.
Applications shown: secure voice, wireless, voice messaging,
network telephony, video conferencing, audio broadcasting,
and audio storage.)

Compression for Packet Networks
The compression algorithms described in this chapter are
used for communication and broadcasting over wired and