wireless networks. Errors that get introduced into the bit
stream during transmission can introduce serious degra-
dations in the decoded signal. In contrast to analog sig-
nals, where transmission impairments mainly introduce
additional noise into the audio signal, digital signals sub-
jected to bit stream errors will produce pops, clicks, and
other annoying artifacts. Especially in wireless applica-
tions, one is likely to encounter transmission errors, and
it is common to add additional information for error cor-
rection and detection. However, even with the use of error
correction techniques, it is still possible that bit errors re-
main in the bit stream.
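As an illustration of the error detection mentioned above, the following sketch simply appends a checksum to each coded frame so that the receiver can flag corrupted frames. The use of CRC-32 and this particular framing are assumptions made for illustration only, not part of any specific coder discussed here.

import zlib

def protect(frame_bytes):
    # Append a 4-byte CRC so residual bit errors in the frame can be detected.
    crc = zlib.crc32(frame_bytes).to_bytes(4, "big")
    return frame_bytes + crc

def check(protected_bytes):
    # Return (frame, ok); ok is False when bit errors are detected.
    frame, crc = protected_bytes[:-4], protected_bytes[-4:]
    return frame, zlib.crc32(frame).to_bytes(4, "big") == crc

A frame that fails such a check would typically be handed to the error mitigation procedure described next rather than decoded directly.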
The sensitivity of a decoder to random bit errors (in
other words, the relative impact of such errors on the decoded signal
quality) should be taken into account in such an applica-
tion. For wired networks the transmission channels are
usually good, and transmission errors are unlikely. How-
ever, in packet networks (e.g., the Internet), it is possi-
ble that packets will not arrive on time. Due to the real-
time nature of the connection, the decoder cannot request
retransmission, and this information is considered to be
lost. To avoid gaps in the signal it is necessary to fill in
the missing information. This technique is referred to as
error mitigation. Sophisticated error mitigation tech-
niques work quite well for segments up to 40–50 ms. For
longer error bursts it is necessary to mute the signal. It
is important to realize that for applications where trans-
mission errors can occur, the overall quality of a coder (in-
cluding the use of error correction and mitigation) may
be dominated by its robustness to channel impairments.
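A minimal sketch of such an error mitigation strategy is given below, assuming 10-ms frames: a lost frame is replaced by an attenuated copy of the last good frame, and the output is muted once a loss burst exceeds roughly 50 ms. The frame size, attenuation factor, and mute threshold are illustrative assumptions, not values taken from any particular standard.

import numpy as np

FRAME_MS = 10        # assumed frame duration
MUTE_AFTER_MS = 50   # mute once a loss burst exceeds roughly 40-50 ms
ATTENUATION = 0.7    # gain applied for each consecutive repeated frame

def conceal(frames):
    # frames: list of NumPy arrays (received frames) or None (lost frames).
    output = []
    last_good = None
    lost_ms = 0
    for frame in frames:
        if frame is not None:                 # frame received intact
            last_good = frame
            lost_ms = 0
            output.append(frame)
            continue
        lost_ms += FRAME_MS                   # frame lost: conceal it
        if last_good is None or lost_ms > MUTE_AFTER_MS:
            length = len(last_good) if last_good is not None else 80
            output.append(np.zeros(length))   # mute long error bursts
        else:
            gain = ATTENUATION ** (lost_ms // FRAME_MS)
            output.append(last_good * gain)   # repeat previous frame, attenuated
    return np.concatenate(output)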
Traditional compression schemes are optimized
for the underlying application (e.g., a cellular system).
These systems are homogeneous, in the sense that all ter-
minals and links meet certain minimum requirements in
throughput and capabilities. The Internet is a much more
heterogeneous network, where endpoints can be quite dif-
ferent in capabilities (e.g., low-end vs. high-end, PC vs.
laptop, wired vs. wireless) and connection throughput
(dial-up vs. broadband). One solution would be to use
scalable coders in which the same coding structure can
be used for operation at different bit rates. This requires
a handshaking process between transmitter and receiver
to agree on the rate to be used. Moreover, if a throughput
bottleneck arises somewhere in the middle of a link, it would
require decoding at the higher rate first, followed by re-
encoding at the lower rate. The resulting transcoding
operation introduces additional delay and a signifi-
cant degradation in quality, because coding distortions
are compounded. A better approach is the use of embedded
coders. In embedded coders there is a core bit stream
that every decoder needs in order to decode the signal with a certain
basic quality. One or more enhancement layers enhance
this core layer. Each enhancement layer will increase the
average bit rate and the quality of the decoded signal. The
encoder generates the core layer and enhancement layers,
but the decoder can decode any version of the signal as
long as it contains the core layer.
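The layered structure can be sketched as successive refinement: a coarse core layer plus residual enhancement layers, where decoding any subset of layers that includes the core yields a usable signal. The multistage quantizer below is a minimal sketch under that assumption and does not reflect the internals of any particular embedded coder.

import numpy as np

def embedded_encode(signal, num_layers=3, step=0.5):
    # Encode the signal as a coarse core layer plus finer residual layers
    # (illustrative multistage quantization; the parameters are assumptions).
    layers = []
    residual = np.asarray(signal, dtype=float)
    for _ in range(num_layers):
        quantized = np.round(residual / step) * step   # quantize current residual
        layers.append(quantized)                       # layer 0 is the core layer
        residual = residual - quantized                # remainder goes to next layer
        step /= 4.0                                    # finer quantizer per layer
    return layers

def embedded_decode(received_layers):
    # Decode whatever layers arrived; the core layer alone already gives the
    # basic quality, and each enhancement layer reduces the reconstruction error.
    return sum(received_layers)

Decoding received_layers[:1] reproduces the basic core quality, while each additional enhancement layer improves the reconstruction, mirroring the behavior described above.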
This is illustrated in Figure 4. Apart from adjusting
for various throughput rates, embedded coders can also
be used for temporary alleviation of congestion. If too
many packets arrive at a given switch (e.g., switch 2 in
Figure 4), the switch can decide to temporarily drop some
enhancement packets to avoid congestion. Depending on
the size of the enhancement layer bit streams and the
coder design, this can be done with only a minor impact
on the audio quality.

Figure 4: Use of embedded coders in a heterogeneous
network.

The embedded coding approach will
alleviate packet loss problems but will not avoid them. Be-
cause we always need the core information, the packet loss
problem will remain. A solution to this problem is the use
of multidescriptive coders. Multidescriptive coders can be
seen as a superset of embedded coders. In this approach, the encoder
creates two or more descriptions of the signal, each of
which can be decoded independently. When all descrip-
tions are received, the best possible quality is obtained;
when only one of the descriptions is received the quality
will be lower. In practice, it is difficult to design efficient
scalable coders and multidescriptive coders, and this topic
is still an active area of research.
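As a toy illustration of the multidescriptive idea, the sketch below forms two descriptions by splitting the signal into even and odd samples: both descriptions together reconstruct the signal exactly, while either one alone still yields a degraded but usable reconstruction. This splitting scheme is only an assumption for illustration; practical multidescriptive coders are considerably more elaborate.

import numpy as np

def mdc_encode(signal):
    # Two independently decodable descriptions: even samples and odd samples.
    signal = np.asarray(signal, dtype=float)
    return signal[0::2], signal[1::2]

def mdc_decode(even_desc=None, odd_desc=None):
    if even_desc is not None and odd_desc is not None:
        # Both descriptions received: interleave them for an exact reconstruction.
        out = np.empty(len(even_desc) + len(odd_desc))
        out[0::2], out[1::2] = even_desc, odd_desc
        return out
    # Only one description received: repeat each sample as a crude
    # interpolation, giving a lower-quality reconstruction.
    received = even_desc if even_desc is not None else odd_desc
    return np.repeat(received, 2)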
Speech and Audio Quality Assessment
The quality assessment of lossy speech and audio com-
pression algorithms is a complicated issue. Quite often
we want to assess not only the quality of the compres-
sion algorithms, but also the quality delivered by these
algorithms in a typical operating scenario, which means
including other aspects of the delivery chain as well, such
as the quality of the network, or the quality and type of
playback equipment. Because the coders use perceptual
techniques, it is important to use human listeners. Even
in this case care has to be taken to get reliable and repro-
ducible results. Choices of test material, number of listen-
ers, training of listeners, test format (e.g., order of play-
back, inclusion of original), and playback scenario (e.g.,
headphones vs. loudspeakers) all affect the outcome of the
test, and it is necessary to design the test so that the impact
of these factors can be minimized. Testing perceptually
lossless coders (no audible differences when compared to
the original) is different from assessing the performance
of lossy coders. The latter is more complicated because
tradeoffs have been made in many dimensions, which
produce different listener responses. For example, some
people prefer large audio bandwidth to reduced signal dis-
tortion. Some stereo audio coders trade off more subtle
issues such as stability of the reproduced stereo image. A
well-designed test will try to eliminate all these biases and
should produce a reproducible result.
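As a rough sketch of how the results of such a test might be aggregated, the code below randomizes the presentation order for each listener and then averages the scores collected per coder condition. The five-point opinion scale and the data layout are assumptions made for illustration; the text does not prescribe any particular test format.

import random
from statistics import mean, stdev

def presentation_order(conditions):
    # Randomize playback order per listener to reduce ordering bias.
    order = list(conditions)
    random.shuffle(order)
    return order

def aggregate(sessions, conditions):
    # sessions: one dict per listener, mapping condition -> opinion score (1..5).
    summary = {}
    for cond in conditions:
        scores = [session[cond] for session in sessions]
        spread = stdev(scores) if len(scores) > 1 else 0.0
        summary[cond] = (mean(scores), spread)   # mean score and its spread
    return summary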