This series is excerpted from "Multidimensional Signal, Image, and Video Processing and Coding."
Part 2 looks at interframe coding, MPEG-2 & MPEG-4. Part 4 looks at Wavelet codecs.
11.3.5 Video Processing of MPEG-Coded Bitstreams
Various video processing tasks have been investigated for coded data and MPEG 2 in particular. In video editing of MPEG bitstreams, the question is how to take two or more MPEG input bitstreams and make one composite MPEG output bitstream.
The problem with decoding/recoding is that it introduces artifacts and is computationally demanding. Staying as much in the MPEG 2 compressed domain as possible and reusing the existing mode decisions and motion vectors have been found essential. Since the GOP boundaries of the input bitstreams may not align, it is necessary to recode the one GOP where the edit point lies. Even then, the original bitrates may make the output video buffer verifier (VBV) overflow. The solution is to requantize this GOP, and perhaps neighboring GOPs, to reduce the likelihood of such buffer overflow. The recent introduction of the HDV format, featuring in-camera MPEG 2 compression of HD video, brings the MPEG bitstream editing problem to the forefront. Many software products are emerging that promise to edit HDV video in its so-called native mode.
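The requantization step can be illustrated with a minimal sketch. It assumes a plain uniform quantizer with a single step size per block; actual MPEG 2 quantization also involves weighting matrices and special handling of intra DC coefficients, which are ignored here. The function name `requantize` and the sample values are hypothetical.

```python
def requantize(levels, q_in, q_out):
    """Map quantized levels from step size q_in to a coarser step size q_out.

    Assumes a uniform quantizer: each level is reconstructed as level * q_in
    and then requantized by rounding its magnitude to the nearest multiple
    of q_out (integer arithmetic, ties rounded up).
    """
    out = []
    for level in levels:
        value = level * q_in                       # approximate reconstruction
        sign = -1 if value < 0 else 1
        magnitude = (2 * abs(value) + q_out) // (2 * q_out)
        out.append(sign * magnitude)
    return out


# Hypothetical run of quantized DCT levels from one block, in scan order.
levels = [12, -7, 3, 1, -1, 0, 1, 0]
coarse = requantize(levels, q_in=4, q_out=10)
print(coarse)
print("nonzero before:", sum(1 for v in levels if v != 0))
print("nonzero after: ", sum(1 for v in coarse if v != 0))
```

The coarser step size turns the small levels into zeros and shrinks the rest, so the entropy coder spends fewer bits on the block, at the price of added quantization error. No motion estimation or mode decision is repeated, which is the point of staying in the compressed domain.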
The MPEG transcoding question is how to go from a high-quality level of MPEG to a lower one without decoding and recoding at the desired output rate. Transcoding is of interest for video-on-demand (VoD) applications. It is also of interest in video production work, where a short GOP (IPIP) may be used internally for editing and program creation, while the longer IBBPBBP··· GOP structure is necessary for program distribution. Finally, there is the worldwide distribution problem, where 525 and 625 line systems (see footnote 3) continue to coexist. Here a motion-compensated transcoding of the MPEG bitstream is required. For more on transcoding, see Chapter 6.3 in Bovik's handbook [48].
11.3.6 H.263 Coder for Visual Conferencing
The H.263 coder from the ITU evolved from its earlier H.261, or p×64, coder. As the original name implies, it is targeted at rates that are a multiple of 64 Kbps. To achieve such low bitrates, it resorts to a small QCIF frame size and a variable, low frame rate, with bitrate control based on buffer fullness. If there is a lot of motion in detailed areas, which generates a lot of bits, then the buffer fills and the frame rate is reduced, i.e., frames are dropped at the encoder. The user can specify a target frame rate, but often the H.263 coder at, say, 64 Kbps will not achieve a target frame rate of 10 fps. The H.263 coder features a group of blocks (GOB) structure, rather than a GOP, with I blocks being inserted randomly for refresh. While there are no B frames, there is the option for so-called PB frames. The coder has half-pixel accurate motion vectors like MPEG 2, and can use overlapped motion vectors from neighboring blocks to achieve a smoother motion field and reduced blocking artifacts. Also, there is an advanced prediction mode option and an arithmetic coder option.
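The buffer-based frame dropping described above can be sketched with a toy simulation. This is not the rate control of any particular H.263 encoder; the function name `simulate_frame_dropping`, the skip threshold, and the per-frame bit counts are all assumptions chosen for illustration.

```python
def simulate_frame_dropping(frame_bits, channel_bps, source_fps, skip_threshold):
    """Return the indices of the source frames that actually get coded.

    The encoder buffer drains at the channel rate. Before each source frame,
    if the buffer still holds more than skip_threshold bits, the frame is
    dropped; otherwise it is coded and its bits are added to the buffer.
    """
    drain_per_frame = channel_bps / source_fps   # bits sent per frame interval
    fullness = 0.0
    coded = []
    for index, bits in enumerate(frame_bits):
        fullness = max(0.0, fullness - drain_per_frame)
        if fullness > skip_threshold:
            continue                             # buffer too full: drop frame
        fullness += bits
        coded.append(index)
    return coded


# Hypothetical sequence: three quiet frames, then six frames with busy motion.
frame_bits = [2000] * 3 + [9000] * 6
coded = simulate_frame_dropping(frame_bits, channel_bps=64000,
                                source_fps=30, skip_threshold=4000)
print("coded frames:", coded)
print("achieved fraction of source rate:", len(coded) / len(frame_bits))
```

With these numbers every quiet frame is coded, but once the expensive frames arrive the buffer needs several frame intervals to drain, so only one busy frame in three is coded. The achieved frame rate therefore falls well below the source rate exactly when the scene is most active.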
The reason for targeting the GOB structure versus the GOP structure is the need to avoid the I frames in the GOP structure, because they require a lot of bits to transmit, a difficulty in videophone, which H.263 targets as a main application. In videophone, low bitrates and a short latency requirement (≤ 200 msec) militate against the bit-hungry I frames. As a result, in H.263, slices or GOBs are updated by I slices randomly, thus reducing the variance in coded frame sizes that occurs with the GOP structure. High variance of bits/frame is not a problem in MPEG 2 because of its targeted entertainment applications, such as video streaming, including multicasting, digital broadcasting, and DVD. Some low bitrate H.263 coding results are contained on the enclosed CD-ROM.
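A small numerical sketch makes the frame-size argument concrete. It compares an idealized periodic I frame schedule with an idealized schedule that spreads the same intra refresh cost evenly over all frames. The bit counts (24,000 bits per I frame, 4,000 bits per P frame), the refresh period of 10 frames, and the function names are hypothetical; only the 64 Kbps rate and the 200 msec latency figure come from the text.

```python
from statistics import mean, pvariance


def periodic_intra(n_frames, period, i_bits, p_bits):
    """Frame sizes when a full I frame is sent once every `period` frames."""
    return [i_bits if k % period == 0 else p_bits for k in range(n_frames)]


def distributed_intra(n_frames, period, i_bits, p_bits):
    """Frame sizes when the extra intra cost is spread evenly over a period."""
    extra = (i_bits - p_bits) / period
    return [p_bits + extra] * n_frames


def transmit_ms(bits, channel_bps):
    """Time in milliseconds needed to send `bits` over the channel."""
    return 1000.0 * bits / channel_bps


periodic = periodic_intra(20, 10, i_bits=24000, p_bits=4000)
spread = distributed_intra(20, 10, i_bits=24000, p_bits=4000)

for name, sizes in (("periodic I frames", periodic), ("distributed refresh", spread)):
    print(name,
          "| mean bits:", mean(sizes),
          "| variance:", pvariance(sizes),
          "| worst-case send time (ms):", transmit_ms(max(sizes), 64000))
```

Both schedules spend the same average number of bits per frame, but in this idealized model the largest frame of the periodic schedule takes 375 msec to send at 64 Kbps, well over the 200 msec budget, while every frame of the distributed schedule takes under 100 msec. Real random refresh is less even than this, so the variance is reduced rather than eliminated.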
11.3.7 H.264/AVC
Research on increasing coding efficiency continued through the late 1990s, and it was found that up to a 50% increase in coding efficiency could be obtained by various improvements to the basic hybrid coding approach of MPEG 2. Instead of using one hypothesis for the motion estimation, multiple hypotheses from multiple locations in past frames could be used [39], together with an optimization approach to allocate the bits. By 2001, the video standards groups at ITU, the Video Coding Experts Group (VCEG), and ISO MPEG convened a joint video team (JVT) to work on the new standard, to be called H.264 by ITU and MPEG 4, Part 10 by the ISO. The common name is Advanced Video Coding (AVC). With reference to Figure 11-17, we see that more frame memory to store past frames has been added to the basic hybrid coder of Figure 11-9. We also see the addition of a loop filter that serves to smooth out blocking artifacts. Further, before the intra transform, there is intra, or directional spatial, prediction (explained below), hence the need to switch between intra and inter prediction modes as seen in Figure 11-17.

Figure 11-17. System diagram of the H.264/AVC coder.
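The benefit of the extra frame memory can be illustrated with a toy block-matching search over several stored reference frames. This is only a sketch of the multiple-reference idea: it picks the single best match by sum of absolute differences (SAD), whereas a real H.264/AVC encoder weighs rate as well as distortion, and the multihypothesis approach of [39] combines several predictions. The function names, the tiny 4 × 4 frames, and the search range are hypothetical.

```python
def sad(block, ref, top, left):
    """Sum of absolute differences between block and ref at (top, left)."""
    return sum(abs(block[r][c] - ref[top + r][left + c])
               for r in range(len(block))
               for c in range(len(block[0])))


def best_match(block, refs, top, left, search=1):
    """Search every reference frame within +/- search pixels of (top, left).

    Returns (cost, ref_index, dy, dx) for the lowest-SAD candidate, where
    ref_index 0 is the most recent stored frame.
    """
    height, width = len(block), len(block[0])
    best = None
    for ref_index, ref in enumerate(refs):
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                y, x = top + dy, left + dx
                if y < 0 or x < 0 or y + height > len(ref) or x + width > len(ref[0]):
                    continue                     # candidate falls outside frame
                cost = sad(block, ref, y, x)
                if best is None or cost < best[0]:
                    best = (cost, ref_index, dy, dx)
    return best


# The object is hidden in the most recent frame but visible in an older one.
newest = [[0, 0, 0, 0] for _ in range(4)]
older = [[0, 0, 0, 0],
         [0, 0, 0, 0],
         [0, 10, 20, 0],
         [0, 30, 40, 0]]
block = [[10, 20],
         [30, 40]]
print(best_match(block, [newest, older], top=1, left=1))
```

In this example the most recent frame offers no good prediction, while the older frame contains an exact match one row down. A coder limited to a single past frame would have to spend bits on a large residual or an intra block; with more stored frames it can signal a reference index and a motion vector instead.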
Footnotes
3. The reader should note that the 525 line system only has 486 visible lines, i.e., it is D1, which is 720 × 486. A similar statement is true for the 625 line system. The remaining lines are hidden!