The Internet Encyclopedia (Volume 3)



DIGITAL VIDEO COMPRESSION STANDARDS

nearest I or P frames (M). Prediction errors are propagated
throughout the GOP and accumulate until the next
I-frame is reached. If N is large, the accumulated error
may become unacceptable. If M is 1, the encoder will use
only I and P frames.
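The relationship between the two GOP parameters can be sketched in a few lines (an illustrative helper, with N the GOP length and M the anchor-frame spacing; the function name is my own):

```python
def gop_pattern(n, m):
    """Return the display-order frame types for one GOP.

    n: GOP length (distance between successive I-frames).
    m: distance between anchor frames (I or P); m - 1 B-frames
       sit between each pair of anchors.  m == 1 means no B-frames.
    """
    pattern = []
    for i in range(n):
        if i == 0:
            pattern.append("I")        # GOP starts with an intra frame
        elif i % m == 0:
            pattern.append("P")        # anchor predicted from previous anchor
        else:
            pattern.append("B")        # bidirectionally predicted frame
    return "".join(pattern)

print(gop_pattern(12, 3))  # IBBPBBPBBPBB
print(gop_pattern(6, 1))   # IPPPPP  (M = 1: only I and P frames)
```

The common broadcast choice N = 12, M = 3 yields the familiar IBBP pattern; larger N lowers the bit rate at the cost of longer error propagation.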
A frame may be divided into slices, which contain
rows of macroblocks, from a single macroblock row to
a full frame. A macroblock contains the luminance infor-
mation Y for a 2 × 2 block area together with the subsam-
pled chrominance information CB, CR, which, for 4:2:0
subsampled video sequences, may cover as little as one
block. Each block, in turn, is defined to contain 8 × 8
pixels over which discrete cosine transform coding is
performed (see Data Compression). MPEG-2 Test Model 5
rate control adjusts, in a feedback loop, the quantization of
the transform coefficients produced by all the blocks of a
macroblock slice.
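The per-block transform can be sketched as a minimal orthonormal 8 × 8 DCT-II in NumPy. Real encoders use fast or integer-approximated transforms, but the coefficient layout is the same: for a flat block, all the energy lands in the single DC coefficient.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II basis matrix (the form used in block coding)."""
    c = np.zeros((n, n))
    for k in range(n):
        for i in range(n):
            alpha = np.sqrt(1.0 / n) if k == 0 else np.sqrt(2.0 / n)
            c[k, i] = alpha * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    return c

def dct2(block):
    """Separable 2-D DCT: transform rows, then columns."""
    c = dct_matrix(block.shape[0])
    return c @ block @ c.T

block = np.full((8, 8), 128.0)   # flat gray 8x8 block
coeffs = dct2(block)
# All energy concentrates in the DC coefficient: 8 * 128 = 1024
print(round(coeffs[0, 0]))       # 1024
```

Rate control then divides these coefficients by a quantizer step; raising the step for busy macroblocks is how a feedback scheme like Test Model 5 holds the bit rate.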

MPEG-4 System Decoder
In the same way that MPEG-1 and -2 standardize only
the bit-stream syntax and decoder algorithms, MPEG-4
standardizes a system decoder model. Koenen's (2002)
overview of the MPEG-4 standard presents the concept
of encoding scenes containing visual objects and sprites.
The discrete cosine transform, quantization, inverse dis-
crete cosine transform, inverse quantization (see Data
Compression), shape coding, motion estimation, motion
prediction, and motion texture coding all combine to
optimize the visual layer of the video compression system
for specific content.

Error Resilience
Video compression typically uses interframe compres-
sion techniques to optimize the bit rate. Furthermore,
the channel coder receives variable length codes as input.
Any transmission errors, to which channels without qual-
ity of service guarantees (such as the wireless Internet)
are especially prone, may cause loss of synchronization
and the inability to reconstruct certain frames. Error re-
silience techniques such as those defined in the MPEG-4
and H.263 standards serve to reduce this risk. Resynchro-
nization markers are inserted periodically at the start of
each video packet, once a predetermined threshold num-
ber of encoded bits has been reached, and reversible
variable-length codes allow some of the data between
resynchronization markers to be recovered when it has
been corrupted by errors.
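The threshold-based marker placement can be sketched as follows (an illustrative packetizer, not the actual MPEG-4 bitstream syntax; all names are hypothetical):

```python
def packetize(macroblock_bits, threshold, marker="RESYNC"):
    """Split a stream of per-macroblock bit counts into video packets.

    A resynchronization marker starts each packet; a new packet begins
    once the accumulated payload reaches the threshold, so markers land
    at roughly uniform bit intervals regardless of content complexity.
    """
    packets, current, used = [], [], 0
    for mb, bits in enumerate(macroblock_bits):
        current.append(mb)
        used += bits
        if used >= threshold:          # threshold reached: close the packet
            packets.append((marker, current))
            current, used = [], 0
    if current:                        # flush any trailing macroblocks
        packets.append((marker, current))
    return packets

# Macroblocks of varying coded size; a marker roughly every 600 bits.
print(packetize([200, 150, 400, 90, 700, 60], 600))
```

Because packets are bounded in bits rather than in macroblocks, a burst error destroys at most one bounded stretch of the stream, and decoding restarts cleanly at the next marker.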

Error Concealment
Koenen (2002) pointed out that a data partitioning ap-
proach can improve on the simple expedient of copying
blocks from a previous frame when errors have occurred.
A second synchronization marker is inserted into the bit
stream between the motion and the texture information.
When errors occur, the texture information is discarded
and the motion information is used to motion-compensate
from the previously decoded video packet. In real-time
situations, where there is a backchannel from the decoder
to the encoder, dynamic resolution conversion may be
used to stabilize the transmission buffering delay.
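The data-partitioned concealment rule can be sketched per block (an illustrative sketch only; a real decoder also bound-checks motion vectors and treats intra blocks differently):

```python
import numpy as np

def conceal_block(prev_frame, y, x, mv, residual, texture_ok, bs=8):
    """Reconstruct one block under data partitioning: if the texture
    partition was corrupted, discard the residual and keep only the
    motion-compensated prediction from the previous decoded frame.
    Assumes the motion vector stays inside the frame (illustration)."""
    dy, dx = mv
    pred = prev_frame[y + dy:y + dy + bs, x + dx:x + dx + bs]
    return pred + residual if texture_ok else pred

prev = np.arange(16 * 16, dtype=float).reshape(16, 16)
residual = np.ones((8, 8))
# Same block, same motion vector; only the texture partition differs.
clean = conceal_block(prev, 0, 0, (0, 8), residual, texture_ok=True)
lost = conceal_block(prev, 0, 0, (0, 8), residual, texture_ok=False)
print(clean[0, 0], lost[0, 0])  # 9.0 8.0
```

Since the motion partition precedes the second marker, it often survives an error that destroys the texture data, and the prediction alone is usually far closer to the true block than a straight copy from the previous frame.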

Content-Based Compression
The MPEG-4 standard supports content-based coding,
random access to content objects, and extended manip-
ulation of content objects.

Shape and Texture Coding
The shape-adaptive discrete cosine transform based on
predefined orthonormal sets of one-dimensional discrete
cosine transform functions (see Kaup & Panis, 1997) can
be used to encode visual objects of arbitrary shape (not
just rectangles) together with texture.
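The row-wise first stage of such a shape-adaptive transform can be sketched as follows (a simplified illustration of the idea, not the exact Kaup & Panis algorithm; the column-wise second pass is omitted):

```python
import numpy as np

def sa_dct_rows(block, mask):
    """Row-wise stage of a shape-adaptive DCT: pixels inside the object
    mask are shifted to the left edge of each row, and an orthonormal
    N-point DCT-II matched to that row's segment length is applied."""
    out = np.zeros(block.shape, dtype=float)
    for r in range(block.shape[0]):
        seg = block[r][mask[r]]          # valid pixels, left-justified
        n = len(seg)
        if n == 0:
            continue                     # row entirely outside the object
        k = np.arange(n)[:, None]
        i = np.arange(n)[None, :]
        c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
        c[0] /= np.sqrt(2.0)             # DC scaling for orthonormality
        out[r, :n] = c @ seg
    return out

block = np.full((2, 4), 2.0)             # flat texture
mask = np.array([[True, True, True, True],
                 [True, True, False, False]])  # arbitrary object shape
co = sa_dct_rows(block, mask)
print(co)
```

Because each row gets a transform sized to its own segment, only as many coefficients are produced as there are object pixels, which is what makes arbitrary shapes no more expensive to code than their actual area.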

Sprite Coding
MPEG-4 supports syntax for the efficient coding of static
and dynamically generated sprites, which are still images
representing backgrounds visible throughout a scene of a
video sequence.

Object Coding
MPEG-4 supports the coding of audiovisual objects.
Figure 3 depicts an audiovisual scene containing scrolling
text, audio, background sprite, arbitrarily shaped video,
and graphics.

General Scalability
The MPEG-4 standard supports the scalability of visual
objects through the profiles described in Table 3.

MPEG-7 Visual and MPEG-21 Standards
The MPEG-7 multimedia content description interface
(Martinez, 2001) supports query by content via descrip-
tors expressed in the Extensible Markup Language (XML). In the
visual case, the standard intends to convey basic and so-
phisticated information about the color, texture, shape,
motion, localization of visual objects, and the recognition
of faces.
The stated vision for the MPEG-21 multimedia frame-
work (Bormans & Hill, 2000) is “to enable transparent
and augmented use of multimedia resources across a wide
range of networks and devices.” The standard is intended
to support interoperable content representation and in-
tellectual property rights management in a “scalable and
error resilient way. The content representation of the me-
dia resources shall be synchronisable and multiplexed and
allow interaction.”
MPEG-4 includes hooks for an open intellectual
property management and protection scheme. A more in-
teroperable solution is planned for development in the
MPEG-21 standard. Bormans and Hill (2000) further
described specific interactions and showed how the
MPEG-7 standard supports transactions that produce and
consume digital data items.

ITU-T Visual Codecs
The ITU-T visual codecs H.261 and H.263, Video Codec for
Audiovisual Services at p × 64 kbit/s and Video Coding
for Low Bit-Rate Communication, are primarily used
for video conferencing applications, in which data rate
and end-to-end delay are important for lip synchroniza-
tion, a situation not encountered in broadcast applica-
tions. However, MPEG-4 contains many concepts derived