The Internet Encyclopedia (Volume 3)



SPEECH AND AUDIO COMPRESSION

[Figure 1: Relationship between bandwidth and bit rate. Horizontal axis: frequency range (Hz), marked at 10, 20, 50, 200, 3400, 7000, 15k, and 20k. Signal classes shown: telephone, wideband, FM-quality (mono), and compact disc (mono). Bit rates shown (kb/s): 64, 256, 512, and 705.6.]
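The bit rates in Figure 1 are consistent with uncompressed PCM, for which the bit rate is the sampling rate multiplied by the number of bits per sample. The short sketch below checks that arithmetic. The sampling rates and word lengths are assumptions (commonly used values for each signal class), not values read from the figure, and the function name is hypothetical.

```python
# Assumed (sampling rate in Hz, bits per sample) for each signal class.
# These pairs are illustrative assumptions, not values taken from the figure.
ASSUMED_PCM_FORMATS = {
    "telephone": (8_000, 8),
    "wideband": (16_000, 16),
    "FM-quality (mono)": (32_000, 16),
    "compact disc (mono)": (44_100, 16),
}


def pcm_bit_rate_kbps(sample_rate_hz, bits_per_sample, channels=1):
    """Bit rate of uncompressed PCM in kb/s (1 kb/s = 1000 bit/s)."""
    return sample_rate_hz * bits_per_sample * channels / 1000


for name, (rate, bits) in ASSUMED_PCM_FORMATS.items():
    print(f"{name}: {pcm_bit_rate_kbps(rate, bits)} kb/s")
```

Under these assumptions the four classes come out at 64, 256, 512, and 705.6 kb/s, matching the values shown in the figure.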

decoder or stored for later retrieval. The decoder takes the bit stream and generates the corresponding decoded PCM signal, which is a rendering of the original input PCM signal.

In the remainder of this chapter we will focus on reducing the bit rates of speech and audio signals while providing the best possible signal quality. Quality is a difficult attribute to quantify because it has many dimensions. Most of these are associated with specific applications, and it is important to set proper objectives when designing or choosing a particular compression algorithm. For example, for speech communications it could be important to have consistent performance for various input conditions, such as clean speech, noisy speech, and input level variations. For compression of music signals it could be important to have consistent performance for various types of music, or to preserve audio bandwidth and stereo image as much as possible. As will become clear later, this quality objective is constrained by other factors such as delay (the time needed to encode and decode a signal) and the complexity (the number of arithmetic operations) of the methods used.

Speech and audio compression applications can be divided into two classes. The first is broadcasting (e.g., streaming); the other is communication (e.g., Internet telephony). Each application has different requirements, as can be seen from Table 1.

There are two principal approaches to the compression of digital signals: lossless and lossy compression. Lossless compression techniques take advantage of redundancies in the numerical representation. For example, instead of using 16 bits per sample uniformly, one could use a new mapping that assigns symbols of shorter length to the most frequent values. Or, if the signal values change slowly from sample to sample, one could encode the differences instead, using fewer bits.
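The two lossless ideas just mentioned, encoding differences and giving shorter codes to frequent values, can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of any particular codec: the sample values are made up, the function names are hypothetical, and a Huffman code is used here as one well-known way to assign shorter symbols to more frequent values.

```python
import heapq
from collections import Counter
from itertools import count


def delta_encode(samples):
    """Keep the first sample, then store each sample-to-sample difference."""
    if not samples:
        return []
    return [samples[0]] + [b - a for a, b in zip(samples, samples[1:])]


def delta_decode(deltas):
    """Invert delta_encode by accumulating the differences."""
    out, total = [], 0
    for d in deltas:
        total += d
        out.append(total)
    return out


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code built from counts."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): 1}
    tie = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(n, next(tie), {s: 0}) for s, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, a = heapq.heappop(heap)
        n2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: depth + 1 for s, depth in {**a, **b}.items()}
        heapq.heappush(heap, (n1 + n2, next(tie), merged))
    return heap[0][2]


# Hypothetical slowly varying 16-bit samples (illustrative values only).
samples = [1000, 1002, 1003, 1003, 1004, 1006, 1007, 1007]
deltas = delta_encode(samples)
restored = delta_decode(deltas)  # identical to samples: the operation is reversible

lengths = huffman_code_lengths(deltas[1:])
coded_bits = sum(lengths[d] for d in deltas[1:])  # bits for the differences
plain_bits = 16 * len(deltas[1:])                 # same samples at 16 bits each
print(restored == samples, coded_bits, plain_bits)
```

The saving in this toy case is exaggerated because the signal is short and very smooth, and the cost of transmitting the code table and the first sample is ignored.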
Lossless compression is a reversible operation and the input and output signal samples of Figure 2 will be identical. Lossy compression techniques, on the other hand, assume that the signal has

[Figure 2: Block diagram of a generic coding or compression operation. Signal path: PCM input, encoder, bitstream, transmission or storage, bitstream, decoder, PCM output.]

Table 1: Difference in Requirements for Broadcasting and Communications

Application       Broadcasting            Communications
Characteristic    One-way transmission    Two-way transmission
Delay             Not important           Important
Complexity        Not important           Important
Technology        Audio coding            Speech coding

a human destination, which means that signal distortions can be introduced as long as the listener either cannot hear them or has no serious objections to them. Lossy compression is not reversible, and the input and output signal samples of Figure 2 will be different. For many signals this difference would be unacceptable, but for audio signals one only worries about audible differences. If the differences are inaudible, the lossy coding techniques used are often referred to as perceptually lossless coding techniques. But even clearly audible differences are acceptable for many applications. A good example is the difference between telephone speech and natural speech, where the telephone signal is significantly limited in bandwidth (typically less than 4 kHz).

In practice, the use of lossless coding for audio and speech signals results in limited compression efficiency, and its use is restricted to high-quality applications such as the audio DVD. The compression efficiency for lossy coding can be significantly higher, and consequently perceptually lossless and lossy coding are the main approaches used in most audio and speech compression applications. Speech compression takes this approach one step further by also assuming that humans generate the source signal, which gives it certain properties that can be taken advantage of by the compression algorithms. As a result, speech compression can achieve very high compression efficiency. Figure 3 gives a summary of the typical bit rates and applications.
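The irreversibility of lossy coding can be illustrated with a deliberately crude example. The sketch below assumes a plain uniform quantizer that discards the low-order bits of each sample; this is far simpler than the perceptual and speech coders discussed in this chapter and is shown only to make the point that the decoded samples differ from the input by a bounded error. The sample values and function names are hypothetical.

```python
def quantize(samples, drop_bits):
    """Discard the `drop_bits` least significant bits of each integer sample."""
    return [s >> drop_bits for s in samples]


def dequantize(codes, drop_bits):
    """Map each code back to the midpoint of its quantization interval."""
    half_step = (1 << drop_bits) >> 1
    return [(c << drop_bits) + half_step for c in codes]


# Hypothetical 16-bit PCM samples (illustrative values only).
original = [1000, -1002, 1003, 517, -4, 32767]
drop_bits = 8  # keep 8 of 16 bits, halving the bit rate

codes = quantize(original, drop_bits)
decoded = dequantize(codes, drop_bits)
errors = [d - o for d, o in zip(decoded, original)]

print(decoded != original)           # the input cannot be recovered exactly
print(max(abs(e) for e in errors))   # error never exceeds half a quantizer step
```

Whether an error of this size is audible depends on the signal and the listener, which is why practical lossy coders shape the distortion according to perception instead of applying a uniform step.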

Compression for Packet Networks

The compression algorithms described in this chapter are used for communication and broadcasting over wired and

[Figure 3: Typical bit rates and applications for speech and audio compression. Bit-rate axis marked at 1, 2, 4, 8, 16, 32, 64, and 128 (kb/s) and at 1, 2, and 4 (Mb/s). Quality axis levels: telephone quality, wideband quality, FM quality, CD quality. Applications shown: secure voice, wireless, voice messaging, network telephony, video conferencing, audio broadcasting, audio storage.]
