gestures. In this situation, the video information can be
represented by a key frame along with delta frames con-
taining the changes between the frames. This is known
as interframe compression. In addition, individual frames
may be compressed using lossy techniques. An example of
this is a technique where the number of bits representing
color information is reduced and some color information
is lost. This is known as intraframe compression. Com-
bining the interframe and intraframe compression tech-
niques can result in compression ratios of up to 200:1 (Compaq,
1998).
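As a rough sketch of the interframe idea (hypothetical Python, assuming grayscale frames stored as NumPy arrays; an illustration, not any codec's actual algorithm), the following stores a key frame and, for each later frame, only the pixels that changed:

```python
import numpy as np

def delta_encode(frames):
    """Encode equal-sized frames as a key frame plus per-frame deltas.

    Each delta records only the coordinates and new values of the
    pixels that changed relative to the previous frame.
    """
    key = frames[0]
    deltas = []
    prev = key
    for frame in frames[1:]:
        changed = np.argwhere(frame != prev)   # coordinates of changed pixels
        values = frame[frame != prev]          # their new values, in the same order
        deltas.append((changed, values))
        prev = frame
    return key, deltas

def delta_decode(key, deltas):
    """Rebuild the frame sequence from the key frame and the deltas."""
    frames = [key]
    current = key
    for changed, values in deltas:
        current = current.copy()
        current[tuple(changed.T)] = values     # apply only the changed pixels
        frames.append(current)
    return frames
```

When little moves between frames, each delta is far smaller than a full frame, which is where the interframe savings come from.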
Another compression technique is quantization, which is
the basis for most lossy compression algorithms. Essen-
tially, it is a process in which data values are rounded to
reduce precision. For the most part, the eye
cannot detect these changes to the fine details (Fischer &
Schroeder, 1996). An example of this type of compression
is the intraframe compression described above. Another
example is the conversion from the RGB color format used
in computer monitors to the YCrCb format used in digital
video, which was discussed in the capturing and digitizing
section of this chapter.
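A minimal sketch of quantization, assuming 8-bit samples: dropping the low-order bits rounds each value to a coarser grid, so fewer distinct values need to be coded. (The function below is illustrative, not part of any standard.)

```python
import numpy as np

def quantize(samples, bits_kept=4):
    """Round 8-bit samples down to the nearest level representable in bits_kept bits."""
    step = 2 ** (8 - bits_kept)        # keeping 4 of 8 bits gives a step of 16
    return (samples // step) * step    # snap each value to the coarser grid

pixels = np.array([17, 52, 200, 255], dtype=np.uint8)
print(quantize(pixels))                # [ 16  48 192 240]
```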
Filtering is a very common technique that involves the
removal of unnecessary data. Transforming is another
technique, where a mathematical function is used to con-
vert the data into a code used for transmission. The trans-
form can then be inverted to recover the data (Vantum
Corporation, 2001).
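As a toy illustration of an invertible transform (a single Haar averaging/differencing step; the MPEG and H.26x algorithms discussed below actually use the discrete cosine transform), the sketch converts pairs of samples into averages and small differences, and then recovers the originals exactly:

```python
import numpy as np

def haar_forward(x):
    """One Haar step: pairwise averages and differences (even-length input)."""
    x = x.astype(float)
    avg = (x[0::2] + x[1::2]) / 2
    diff = (x[0::2] - x[1::2]) / 2
    return avg, diff

def haar_inverse(avg, diff):
    """Invert the transform: recover the original samples exactly."""
    x = np.empty(avg.size * 2)
    x[0::2] = avg + diff
    x[1::2] = avg - diff
    return x

samples = np.array([10, 12, 200, 202, 50, 54])
avg, diff = haar_forward(samples)
print(avg)                       # [ 11. 201.  52.]  carries most of the content
print(diff)                      # [-1. -1. -2.]     small, so it compresses well
print(haar_inverse(avg, diff))   # [ 10.  12. 200. 202.  50.  54.]
```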
For videos that include audio, the process used to com-
press the audio is very different from the one used to
compress the video, even though the underlying tech-
niques are similar to those described above. This is because the
eye and ear work very differently. The ear has a much
higher dynamic range and resolution. The ear can pick
out more details but it is slower than the eye (Filippini,
1997). Sound is recorded as voltage levels and it is sam-
pled by the computer a number of times per second. The
higher the sampling rate, the higher the quality and hence
the greater the need for compression. Compressing audio
data involves removing the unneeded and redundant parts
of the signal. In addition, the portions of the signal that
cannot be heard are removed.
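A quick back-of-the-envelope calculation (the CD-quality figures here are common reference values, not taken from this chapter's sources) shows how quickly uncompressed audio data accumulates:

```python
# Uncompressed PCM data rate = sampling rate x bits per sample x channels.
# CD-quality stereo audio is commonly 44,100 samples/s, 16 bits, 2 channels.
sampling_rate = 44_100               # samples per second
bits_per_sample = 16
channels = 2

bits_per_second = sampling_rate * bits_per_sample * channels
print(bits_per_second / 1_000_000)   # about 1.41 Mbps before any compression
```

VIDEO COMPRESSION ALGORITHMS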
Some algorithms were designed for wide bandwidths and
some for narrow bandwidths. Some algorithms were de-
veloped specifically for CD-ROMs and others for stream-
ing video. There are a number of compression algorithms
available for streaming video; this chapter will discuss the
major ones in use today. These algorithms are MPEG-1,
MPEG-2, MPEG-4, H.261, H.263, and MJPEG. The
video compression algorithms can be separated into two
groups: those that make use of frame-to-frame redun-
dancy and those that do not. The algorithms that make
use of this redundancy can achieve significantly greater
compression. However, more computational power is re-
quired to encode video where frame-to-frame redundan-
cies are utilized.
As mentioned earlier in this chapter, MPEG stands for
Moving Picture Experts Group, a working group of the
International Organization for Standardization (ISO) (Compaq,
1998). This group has defined several levels of standards
for video and audio compression. The MPEG standard
only specifies a data model for compression and, thus,
it is an open, independent standard. MPEG is becoming
very popular with streaming video creators and users.
The first of these standards, MPEG-1, was made avail-
able in 1993 and was aimed primarily at video conferenc-
ing, videophones, computer games, and first-generation
CD-ROMs. It was designed for consumer video and
CD-ROM audio applications that operate at a data rate of
approximately 1.5 Mbps and a frame rate of 30 frames per
second. It has a resolution of 352×240 and supports play-
back functions such as fast forward, reverse, and random
access into the bitstream (Compaq, 1998). It is currently
used for video CDs and it is a common format for video
on the Internet when good quality is desired and when
its bandwidth requirements can be supported (Vantum
Corporation, 2001).
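A rough calculation, using the figures above and assuming 24-bit RGB pixels, shows why a high degree of compression is needed to fit the 1.5 Mbps budget:

```python
# Uncompressed video rate for MPEG-1-class material, assuming 24-bit color.
width, height = 352, 240
frames_per_second = 30
bits_per_pixel = 24

raw_bps = width * height * bits_per_pixel * frames_per_second
target_bps = 1_500_000               # MPEG-1's roughly 1.5 Mbps budget

print(raw_bps / 1_000_000)           # about 60.8 Mbps uncompressed
print(round(raw_bps / target_bps))   # roughly 41:1 compression needed
```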
MPEG-1 uses interframe compression to remove
redundant data between the frames, as discussed in the
previous section on compression techniques. It also uses
intraframe compression within an individual frame as
described in the previous section. This compression al-
gorithm generates three types of frames: I-frames, P-
frames, and B-frames. I-frames do not reference other
previous or future frames. They are stand-alone or Inde-
pendent frames and they are larger than the other frames.
They are compressed only with intraframe compression.
They are the entry points for indexing or rewinding the
video, because they represent complete pictures (Compaq,
1998).
On the other hand, P-frames contain predictive infor-
mation with respect to the previous I- or P-frame. They
contain only the pixels that have changed since the last
frame, and they account for motion. In addition, they
are smaller than the I-frames, because they are more
compressed. I-frames are sent at regular intervals during
the transmission process. P-frames are sent at some time in-
terval after the I-frames have been sent (this time inter-
val will vary based on the transmission of the streaming
video).
If the video has a lot of motion, the P-frames may not
come fast enough to give the perception of smooth mo-
tion. Therefore, B-frames are inserted between the I- and
P-frames. B-frames use data in the previous I- or P-frames
as well as the future I- or P-frames, thus, they are consid-
ered bidirectional. The data that they contain are an in-
terpolation of the data in the previous and future frames,
with the assumption that the pixels will not drastically
change between the two frames. As a result, the B-frames
have the most compression and are the smallest of the
three types of frames. In order for a decoder to decode
the B-frames, it must have the I- and P-frames that they
are based on; thus the frames may be transmitted out of
order to reduce decoding delays (Compaq, 1998).
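The following simplified sketch shows the bidirectional idea (real MPEG encoders interpolate motion-compensated blocks, not raw pixels as here): a B-frame is predicted from the reference frames on either side of it, and only a small correction needs to be stored:

```python
import numpy as np

def predict_b_frame(prev_ref, next_ref, weight=0.5):
    """Approximate a B-frame by interpolating its surrounding reference frames."""
    return weight * prev_ref.astype(float) + (1 - weight) * next_ref.astype(float)

def b_frame_residual(actual, prev_ref, next_ref):
    """The correction the encoder stores: the actual frame minus the prediction.

    If pixels change smoothly between the two references, this residual is
    near zero almost everywhere, which is why B-frames are the smallest.
    """
    return actual.astype(float) - predict_b_frame(prev_ref, next_ref)
```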
A frame sequence consisting of an I-frame and its fol-
lowing B- and P-frames before the next I-frame is called
a group of pictures (GOP) (Compaq, 1998). There are usu-
ally around 15 frames in a GOP. An example of the MPEG
encoding process can be seen in Figure 1. The letters I, P,
and B in the figure represent the I-, P-, and B-frames that
could possibly be included in a group of pictures. The let-
ters were sized to indicate the relative size of the frame
(as compared to the other frames).
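The sketch below (hypothetical, assuming the common pattern of two B-frames between references) builds a 15-frame GOP in display order and then reorders it for transmission so that every B-frame follows both of the reference frames it depends on, as described above:

```python
def make_gop(length=15, b_between=2):
    """Build a GOP in display order: an I-frame, then repeating B..B,P groups."""
    gop = ["I"]
    while len(gop) < length:
        gop.extend(["B"] * b_between + ["P"])
    return gop[:length]

def transmission_order(gop):
    """Reorder so each B-frame is sent after the two references it needs."""
    out, held_b = [], []
    for frame in gop:
        if frame == "B":
            held_b.append(frame)    # hold B-frames until their future reference
        else:
            out.append(frame)       # send the I- or P-frame first...
            out.extend(held_b)      # ...then the B-frames that depend on it
            held_b = []
    # B-frames at the very end would really wait for the next GOP's I-frame;
    # they are simply appended here to keep the sketch self-contained.
    return out + held_b

gop = make_gop()
print("".join(gop))                      # IBBPBBPBBPBBPBB (display order)
print("".join(transmission_order(gop)))  # IPBBPBBPBBPBBBB (bitstream order)
```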