The Internet Encyclopedia (Volume 3)



DIGITAL VIDEO COMPRESSION STANDARDS

nearest I or P frames (M). Prediction errors are propagated
throughout the GOP and accumulate until the next
I-frame is reached. If N is large, the accumulated error
may become unacceptable. If M is 1, the encoder will use
only I and P frames.
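The relationship between the two GOP parameters can be sketched in a few lines (an illustrative helper, with N the GOP length and M the anchor-frame spacing; the function name is my own):

```python
def gop_pattern(n, m):
    """Return the display-order frame types for one GOP.

    n: GOP length (distance between successive I-frames).
    m: distance between anchor frames (I or P); m - 1 B-frames
       sit between each pair of anchors.  m == 1 means no B-frames.
    """
    pattern = []
    for i in range(n):
        if i == 0:
            pattern.append("I")        # GOP starts with an intra frame
        elif i % m == 0:
            pattern.append("P")        # anchor predicted from previous anchor
        else:
            pattern.append("B")        # bidirectionally predicted frame
    return "".join(pattern)

print(gop_pattern(12, 3))  # IBBPBBPBBPBB
print(gop_pattern(6, 1))   # IPPPPP  (M = 1: only I and P frames)
```

The common broadcast choice N = 12, M = 3 yields the familiar IBBP pattern; larger N lowers the bit rate at the cost of longer error propagation.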
A frame may be divided into slices, which contain
rows of macroblocks, from a single macroblock row to
a full frame. A macroblock contains the luminance infor-
mation Y for a 2 × 2 block area together with the subsam-
pled chrominance information CB, CR, which, for 4:2:0
subsampled video sequences, may cover as little as one
block. Each block, in turn, is defined to contain 8 × 8
pixels over which discrete cosine transform coding is
performed (see Data Compression). MPEG-2 Test Model 5
rate control adjusts, in a feedback loop, the quantization of
the transform coefficients produced by all the blocks of a
macroblock slice.
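The per-block transform can be sketched as a minimal orthonormal 8 × 8 DCT-II in NumPy. Real encoders use fast or integer-approximated transforms, but the coefficient layout is the same: for a flat block, all the energy lands in the single DC coefficient.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix (the form used in block coding)."""
    c = np.zeros((n, n))
    for k in range(n):
        for i in range(n):
            alpha = np.sqrt(1.0 / n) if k == 0 else np.sqrt(2.0 / n)
            c[k, i] = alpha * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    return c

def dct2(block):
    """Separable 2-D DCT: transform rows, then columns."""
    c = dct_matrix(block.shape[0])
    return c @ block @ c.T

block = np.full((8, 8), 128.0)   # flat gray 8x8 block
coeffs = dct2(block)
# All energy concentrates in the DC coefficient: 8 * 128 = 1024
print(round(coeffs[0, 0]))       # 1024
```

Rate control then divides these coefficients by a quantizer step; raising the step for busy macroblocks is how a feedback scheme like Test Model 5 holds the bit rate.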

MPEG-4 System Decoder
In the same way that MPEG-1 and -2 standardize only
the bit-stream syntax and decoder algorithms, MPEG-4
standardizes a system decoder model. Koenen's (2002)
overview of the MPEG-4 standard presents the concept
of encoding scenes containing visual objects and sprites.
The discrete cosine transform, quantization, inverse dis-
crete cosine transform, inverse quantization (see Data
Compression), shape coding, motion estimation, motion
prediction, and motion texture coding all combine to
optimize the visual layer of the video compression system
for specific content.

Error Resilience
Video compression typically uses interframe compres-
sion techniques to optimize the bit rate. Furthermore,
the channel coder receives variable length codes as input.
Any transmission errors, to which channels without qual-
ity of service guarantees (such as the wireless Internet)
are especially prone, may cause loss of synchronization
and the inability to reconstruct certain frames. Error re-
silience techniques such as those defined in the MPEG-4
and H.263 standards serve to reduce this risk. Resynchro-
nization markers are inserted periodically at the start of
each video packet, once a predetermined threshold num-
ber of encoded bits has been reached, and reversible
variable-length codes allow some of the data between
resynchronization markers to be recovered when it has
been corrupted by errors.
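The threshold-based marker placement can be sketched as follows (an illustrative packetizer, not the actual MPEG-4 bitstream syntax; all names are hypothetical):

```python
def packetize(macroblock_bits, threshold, marker="RESYNC"):
    """Split a stream of per-macroblock bit counts into video packets.

    A resynchronization marker starts each packet; a new packet begins
    once the accumulated payload reaches the threshold, so markers land
    at roughly uniform bit intervals regardless of content complexity.
    """
    packets, current, used = [], [], 0
    for mb, bits in enumerate(macroblock_bits):
        current.append(mb)
        used += bits
        if used >= threshold:          # threshold reached: close the packet
            packets.append((marker, current))
            current, used = [], 0
    if current:                        # flush any trailing macroblocks
        packets.append((marker, current))
    return packets

# Macroblocks of varying coded size; a marker roughly every 600 bits.
print(packetize([200, 150, 400, 90, 700, 60], 600))
```

Because packets are bounded in bits rather than in macroblocks, a burst error destroys at most one bounded stretch of the stream, and decoding restarts cleanly at the next marker.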

Error Concealment
Koenen (2002) pointed out that a data partitioning ap-
proach can improve on the simple expedient of copying
blocks from a previous frame when errors have occurred.
A second synchronization marker is inserted into the bit
stream between the motion and the texture information.
When errors occur, the texture information is discarded
and the motion information is used to motion-compensate
from the previously decoded video packet. In real-time
situations, where there is a backchannel from the decoder
to the encoder, dynamic resolution conversion may be
used to stabilize the transmission buffering delay.
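The data-partitioned concealment rule can be sketched per block (an illustrative sketch only; a real decoder also bound-checks motion vectors and treats intra blocks differently):

```python
import numpy as np

def conceal_block(prev_frame, y, x, mv, residual, texture_ok, bs=8):
    """Reconstruct one block under data partitioning: if the texture
    partition was corrupted, discard the residual and keep only the
    motion-compensated prediction from the previous decoded frame.
    Assumes the motion vector stays inside the frame (illustration)."""
    dy, dx = mv
    pred = prev_frame[y + dy:y + dy + bs, x + dx:x + dx + bs]
    return pred + residual if texture_ok else pred

prev = np.arange(16 * 16, dtype=float).reshape(16, 16)
residual = np.ones((8, 8))
# Same block, same motion vector; only the texture partition differs.
clean = conceal_block(prev, 0, 0, (0, 8), residual, texture_ok=True)
lost = conceal_block(prev, 0, 0, (0, 8), residual, texture_ok=False)
print(clean[0, 0], lost[0, 0])  # 9.0 8.0
```

Since the motion partition precedes the second marker, it often survives an error that destroys the texture data, and the prediction alone is usually far closer to the true block than a straight copy from the previous frame.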

Content-Based Compression
The MPEG-4 standard supports content-based coding,
random access to content objects, and extended manip-
ulation of content objects.

Shape and Texture Coding
The shape-adaptive discrete cosine transform based on
predefined orthonormal sets of one-dimensional discrete
cosine transform functions (see Kaup & Panis, 1997) can
be used to encode visual objects of arbitrary shape (not
just rectangles) together with texture.
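The row-wise first stage of such a shape-adaptive transform can be sketched as follows (a simplified illustration of the idea, not the exact Kaup & Panis algorithm; the column-wise second pass is omitted):

```python
import numpy as np

def sa_dct_rows(block, mask):
    """Row-wise stage of a shape-adaptive DCT: pixels inside the object
    mask are shifted to the left edge of each row, and an orthonormal
    N-point DCT-II matched to that row's segment length is applied."""
    out = np.zeros(block.shape, dtype=float)
    for r in range(block.shape[0]):
        seg = block[r][mask[r]]          # valid pixels, left-justified
        n = len(seg)
        if n == 0:
            continue                     # row entirely outside the object
        k = np.arange(n)[:, None]
        i = np.arange(n)[None, :]
        c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
        c[0] /= np.sqrt(2.0)             # DC scaling for orthonormality
        out[r, :n] = c @ seg
    return out

block = np.full((2, 4), 2.0)             # flat texture
mask = np.array([[True, True, True, True],
                 [True, True, False, False]])  # arbitrary object shape
co = sa_dct_rows(block, mask)
print(co)
```

Because each row gets a transform sized to its own segment, only as many coefficients are produced as there are object pixels, which is what makes arbitrary shapes no more expensive to code than their actual area.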

Sprite Coding
MPEG-4 supports syntax for the efficient coding of static
and dynamically generated sprites, which are still images
representing backgrounds visible throughout a scene of a
video sequence.

Object Coding
MPEG-4 supports the coding of audiovisual objects.
Figure 3 depicts an audiovisual scene containing scrolling
text, audio, background sprite, arbitrarily shaped video,
and graphics.

General Scalability
The MPEG-4 standard supports the scalability of visual
objects through the profiles described in Table 3.

MPEG-7 Visual and MPEG-21 Standards
The MPEG-7 multimedia content description interface
(Martinez, 2001) supports query by content via descrip-
tors expressed in the Extensible Markup Language (XML). In the
visual case, the standard intends to convey basic and so-
phisticated information about the color, texture, shape,
motion, localization of visual objects, and the recognition
of faces.
The stated vision for the MPEG-21 multimedia frame-
work (Bormans & Hill, 2000) is “to enable transparent
and augmented use of multimedia resources across a wide
range of networks and devices.” The standard is intended
to support interoperable content representation and in-
tellectual property rights management in a “scalable and
error resilient way. The content representation of the me-
dia resources shall be synchronisable and multiplexed and
allow interaction.”
MPEG-4 includes hooks for an open intellectual
property management and protection scheme. A more in-
teroperable solution is planned for development in the
MPEG-21 standard. Bormans and Hill (2000) further
described specific interactions and showed how the
MPEG-7 standard supports transactions that produce and
consume digital data items.

ITU-T Visual Codecs
The ITU-T visual codecs H.261 and H.263, Video Codec for
Audiovisual Services at p × 64 kbit/s and Video Coding
for Low Bit-Rate Communication, are primarily used
for video conferencing applications, in which data rate
and end-to-end delay are important for lip synchroniza-
tion, a situation not encountered in broadcast applica-
tions. However, MPEG-4 contains many concepts derived