Interlaced Video – an overview | ScienceDirect Topics


MPEG-1 and MPEG-2 Video Standards

Supavadee Aramvith, Ming-Ting Sun, in Handbook of Image and Video Processing (Second Edition), 2005

2.4.2 Interlaced Video Coding

Figure 11 shows the interlaced video format. As explained earlier, an interlaced frame is composed of two fields. In the figure, the top field (Field 1) occurs earlier in time than the bottom field (Field 2); together the two fields form a frame. In MPEG-2, pictures are coded as I-, P-, and B-pictures as in MPEG-1. To encode interlaced video efficiently, MPEG-2 can encode a picture either as a field picture or as a frame picture. In the field-picture mode, the two fields of the frame are encoded separately. If the first field in a picture is an I-picture, the second field can be either an I- or a P-picture, since the second field can use the first field as a reference. However, if the first field in a picture is a P- or B-field picture, the second field must be of the same picture type. In a frame picture, the two fields are interleaved and coded together as one picture. In MPEG-2, a video sequence is a collection of frame pictures and field pictures.

FIGURE 11. Interlaced video format.
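To make the field/frame relationship concrete, the following short sketch (illustrative Python/NumPy, not part of the standard; the function names are hypothetical) interleaves two fields into an interlaced frame and splits a frame back into its fields:

import numpy as np

def weave_fields(field1, field2):
    # Field 1 supplies the even frame lines (0, 2, 4, ...), field 2 the odd ones.
    height, width = field1.shape
    frame = np.empty((2 * height, width), dtype=field1.dtype)
    frame[0::2, :] = field1
    frame[1::2, :] = field2
    return frame

def split_fields(frame):
    # Inverse operation: separate an interlaced frame into its two fields.
    return frame[0::2, :], frame[1::2, :]

# Example: a 480-line frame woven from two 240-line fields.
field1 = np.zeros((240, 720), dtype=np.uint8)
field2 = np.ones((240, 720), dtype=np.uint8)
frame = weave_fields(field1, field2)
assert np.array_equal(split_fields(frame)[0], field1)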

2.4.2-1 Frame-based and Field-based Motion Compensated Prediction.

In MPEG-2, an interlaced picture can be encoded as a frame picture or as field pictures. MPEG-2 defines two motion-compensated prediction types: frame-based and field-based. Frame-based prediction forms a prediction from reference frames, while field-based prediction forms a prediction from reference fields. For the simple profile, where bidirectional prediction cannot be used, MPEG-2 introduced dual-prime motion-compensated prediction to efficiently exploit the temporal redundancy between fields. Figure 12 shows the three types of motion-compensated prediction. Note that all motion vectors in MPEG-2 are specified with half-pixel resolution.

FIGURE 12. Three types of motion compensated prediction.

Frame prediction in frame pictures: in frame-based prediction for frame pictures, shown in Fig. 12(a), the whole interlaced frame is treated as a single picture, and the same motion-compensated predictive coding method as in MPEG-1 is used. Each 16 × 16 macroblock can have only one motion vector for each forward or backward prediction; two motion vectors are allowed in the case of bi-directional prediction.
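As a rough illustration of frame-based prediction with a single half-pel motion vector per macroblock, the sketch below fetches a motion-compensated 16 × 16 prediction from a reference frame. It is a simplified example under stated assumptions (names are hypothetical, boundary handling is omitted); the averaging used for half-pel positions follows the usual bilinear scheme and is not meant to reproduce the normative MPEG-2 text.

import numpy as np

def predict_macroblock(ref_frame, x, y, mv_x, mv_y, size=16):
    # (x, y) is the top-left corner of the macroblock; mv_x and mv_y are the
    # motion-vector components in half-pel units, as used by MPEG-2.
    ix, hx = mv_x >> 1, mv_x & 1        # integer and half-pel parts
    iy, hy = mv_y >> 1, mv_y & 1
    # Take a (size+1) x (size+1) patch so that the right/lower neighbours needed
    # for half-pel averaging are available (boundary handling omitted).
    p = ref_frame[y + iy : y + iy + size + 1,
                  x + ix : x + ix + size + 1].astype(np.int32)
    a = p[:size, :size]                 # integer-pel samples
    b = p[:size, 1:]                    # right neighbours
    c = p[1:, :size]                    # lower neighbours
    d = p[1:, 1:]                       # diagonal neighbours
    if hx and hy:
        pred = (a + b + c + d + 2) >> 2 # half-pel in both directions
    elif hx:
        pred = (a + b + 1) >> 1         # horizontal half-pel
    elif hy:
        pred = (a + c + 1) >> 1         # vertical half-pel
    else:
        pred = a                        # integer-pel position
    return pred.astype(ref_frame.dtype)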

Field prediction in frame pictures: field-based prediction in frame pictures treats each frame picture as two separate field pictures. Separate predictions are formed for each 16 × 8 block of the macroblock, as shown in Fig. 13, so field-based prediction in a frame picture needs two sets of motion vectors; a total of four motion vectors are allowed in the case of bi-directional prediction. Each field prediction may select either field 1 or field 2 of the reference frame.

FIGURE 13. Blocks for frame-/field-based prediction.
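The 16 × 8 splitting of Fig. 13 can be pictured with a small sketch (illustrative Python/NumPy, hypothetical names): the frame macroblock is separated into its field-1 and field-2 lines, and each resulting 16 × 8 block would then be predicted with its own motion vector and reference-field selection.

import numpy as np

def split_macroblock_into_field_blocks(mb):
    # A 16 x 16 frame macroblock is separated into two field blocks of
    # 8 lines x 16 pixels each (the "16 x 8" blocks of Fig. 13).
    assert mb.shape == (16, 16)
    field1_block = mb[0::2, :]   # lines from field 1 (even frame lines)
    field2_block = mb[1::2, :]   # lines from field 2 (odd frame lines)
    return field1_block, field2_block

def merge_field_blocks(field1_block, field2_block):
    # Re-interleave the two 16 x 8 predictions into a 16 x 16 frame macroblock.
    mb = np.empty((16, 16), dtype=field1_block.dtype)
    mb[0::2, :] = field1_block
    mb[1::2, :] = field2_block
    return mb

# Each of the two field blocks carries its own motion vector (so up to four
# motion vectors for a bi-directionally predicted macroblock) and its own
# selection of field 1 or field 2 of the reference frame.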

Field prediction in field pictures: in field-based prediction for field pictures, the prediction is formed from the two most recently decoded fields. The predictions are made from reference fields, independently for each field, with each field treated as an independent picture. The prediction block size is 16 × 16; note, however, that a 16 × 16 block in a field picture corresponds to a 16 × 32 pixel area of the frame. Field-based prediction in a field picture needs only one motion vector for each forward or backward prediction; two motion vectors are allowed in the case of bi-directional prediction.
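A tiny sketch of the coordinate relationship (hypothetical helper, integer line numbers only) may help: a block that is 16 lines tall in a field spans 32 lines of the corresponding frame, because the lines of the opposite field are interleaved between them.

def frame_line_span_of_field_block(first_field_line, block_height=16):
    # A block block_height lines tall in a field covers 2 * block_height
    # consecutive frame lines: its own lines plus the interleaved lines of
    # the opposite field. Parity offsets are ignored in this sketch.
    first_frame_line = 2 * first_field_line
    return first_frame_line, first_frame_line + 2 * block_height  # half-open range

# A 16 x 16 field block starting at field line 40 covers frame lines 80..111,
# i.e. a 16 x 32 pixel area of the frame.
print(frame_line_span_of_field_block(40))   # (80, 112)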

16 × 8 prediction in field pictures: two motion vectors are used for each macroblock. The first motion vector is applied to the upper 16 × 8 half of the macroblock and the second motion vector to the lower 16 × 8 half. A total of four motion vectors are allowed in the case of bi-directional prediction.

Dual-prime motion-compensated prediction can be used only in P-pictures. Once the motion vector “v” for a macroblock in a field of a given parity (field 1 or field 2) is known relative to a reference field of the same parity, it is extrapolated or interpolated to obtain a prediction motion vector for the opposite-parity reference field. In addition, a small correction is made to the vertical component of the motion vectors to reflect the vertical shift between the lines of field 1 and field 2. These derived motion vectors are denoted dv1 and dv2 (shown as dashed lines in Fig. 12(c)). Next, a small refinement differential motion vector, called “dmv,” is added. The choice of the dmv value (−1, 0, +1) is made by the encoder. The motion vector “v” and its corresponding “dmv” value are included in the bit-stream so that the decoder can also derive dv1 and dv2. In calculating the pixel values of the prediction, the motion-compensated predictions from the two reference fields are averaged, which tends to reduce the noise in the data.
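The derivation can be summarized with a small numeric sketch. This is only an illustration under stated assumptions (the exact scaling and rounding rules and the sign of the parity correction are defined in the MPEG-2 standard and are not reproduced here); the variable names are hypothetical.

def dual_prime_component(v, dmv, t_same, t_opposite, parity_correction):
    # v                 : transmitted motion-vector component (half-pel units)
    #                     pointing to the same-parity reference field
    # dmv               : transmitted differential correction (-1, 0 or +1)
    # t_same, t_opposite: temporal distances to the same- and opposite-parity
    #                     reference fields
    # parity_correction : small vertical shift (in half-pel units) accounting
    #                     for the line offset between field 1 and field 2;
    #                     use 0 for the horizontal component
    # Scale v by the relative temporal distance (illustrative rounding), then
    # apply the parity correction and the transmitted dmv.
    scaled = round(v * t_opposite / t_same)
    return scaled + parity_correction + dmv

# The final prediction averages the two motion-compensated predictions, e.g.
#   pred = (pred_same_parity + pred_opposite_parity + 1) // 2
# which is what reduces the noise mentioned above.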

Dual-prime prediction is intended mainly for low-delay coding applications such as videophone and video conferencing. For low-delay coding with the simple profile, B-pictures are not used; since bi-directional prediction is then unavailable, dual-prime prediction was developed for P-pictures to provide a better prediction than forward prediction alone.

2.4.2-2 Frame/Field DCT.

MPEG-2 has two DCT modes: frame-based and field-based DCT, as shown in Fig. 14. In the frame-based DCT mode, a 16 × 16-pixel macroblock is divided into four 8 × 8 DCT blocks. This mode is suitable for blocks in the background or in still regions with little motion, because such blocks have high correlation between pixel values in adjacent scan lines. In the field-based DCT mode, a macroblock is divided into four DCT blocks in which pixels from the same field are grouped together. This mode is suitable for blocks with motion because, as explained earlier, motion causes distortion between the two fields and may introduce high-frequency noise into the interlaced frame.

FIGURE 14. Frame/field format block for DCT.
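The difference between the two DCT modes is purely a reordering of the macroblock lines before the 8 × 8 transforms. Below is a minimal sketch (illustrative Python/NumPy, hypothetical names; the 2-D DCT itself is not shown) of how the four 8 × 8 luminance blocks are formed in each mode.

import numpy as np

def frame_dct_blocks(mb):
    # Frame DCT mode: the 16 x 16 luminance macroblock is simply cut into four
    # 8 x 8 blocks; adjacent scan lines from both fields stay together.
    return [mb[r:r + 8, c:c + 8] for r in (0, 8) for c in (0, 8)]

def field_dct_blocks(mb):
    # Field DCT mode: lines of the same field are grouped first, so each 8 x 8
    # block contains lines from only one field.
    reordered = np.concatenate([mb[0::2, :], mb[1::2, :]], axis=0)
    return [reordered[r:r + 8, c:c + 8] for r in (0, 8) for c in (0, 8)]

# Each returned 8 x 8 block is then transformed with the 2-D DCT and quantized.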

2.4.2-3 Alternate Scan.

MPEG-2 defines two scanning orders for the quantized DCT coefficients: the zigzag scan and the alternate scan, shown in Fig. 15. The zigzag scan used in MPEG-1 is suitable for progressive images, where the frequency components are of equal importance in the horizontal and vertical directions. The alternate scan was introduced in MPEG-2 because interlaced images tend to have more energy in the higher vertical frequencies; the scanning order therefore gives more weight to higher vertical frequencies than to the corresponding horizontal frequencies. In MPEG-2, the selection between the two scan orders can be made on a picture-by-picture basis.

FIGURE 15. Progressive/Interlaced scan.
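A scan order is simply the order in which the 64 quantized coefficients of an 8 × 8 block are read out before run-length coding. The sketch below (illustrative Python, hypothetical names) generates the conventional zigzag order and applies an arbitrary scan table to a coefficient block; the exact alternate-scan table is the one tabulated in the MPEG-2 standard and shown in Fig. 15, and is not reproduced here.

import numpy as np

def zigzag_order(n=8):
    # Conventional zigzag scan: coefficients are visited along anti-diagonals,
    # alternating direction, starting from the DC coefficient at (0, 0).
    return sorted(((r, c) for r in range(n) for c in range(n)),
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else rc[1]))

def scan_coefficients(block, order):
    # Flatten an 8 x 8 block of quantized DCT coefficients into a 1-D sequence
    # following the given scan order (zigzag or the MPEG-2 alternate scan).
    return [block[r, c] for r, c in order]

coeffs = np.arange(64).reshape(8, 8)
print(scan_coefficients(coeffs, zigzag_order())[:6])   # [0, 1, 8, 16, 9, 2]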
