As we've highlighted hardware technologies at DVDFILE, we've touched upon
progressive scan DVD players and line doublers that convert interlaced video
to progressive video. Some home theater components that produce progressive
scan video have the commendable ability to correct for most of the
unavoidable visual distortions caused by the 3:2 pulldown process. Our tech
editor was so impressed with the Toshiba progressive scan DVD player he
reviewed that it became his reference unit. So let's take a look at the 3:2
pulldown process, discuss why it's necessary, describe the artifacts, and
consider how we might get rid of its visual distortions.
Video versus Film
You wouldn't have to read this article if film and video weren't so
different. It's due to those differences that the 3:2 pulldown process
becomes necessary. So to begin, we should examine the nature of video and
film.
NTSC (National Television System Committee) video is composed of 525
horizontal scan lines. In our wonderful world of DVD, 480 of those scan
lines are available to contain picture information. NTSC video is
interlaced. In other words, even though the video is shown at 29.97 frames
(pictures) per second, each video frame contains two video fields. One field
is composed of all the odd horizontal scan lines; the other contains all the
even horizontal scan lines. So despite the reality that NTSC video displays
29.97 frames or pictures each second, it's actually created as 59.94 fields
per second. Consequently, any four sequential video frames (A, B, C, D) are
drawn on the video display as A1, A2, B1, B2, C1, C2, D1, D2, where the 1 or
2 represents the field number within the frame. This is what the vast
majority of video displays - including, most likely, yours - expects from
any video signal source.
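To make the field bookkeeping concrete, here is a minimal Python sketch of how a frame splits into two fields and the order in which they reach the display. The names `split_fields` and `field_order` are invented for illustration, and a frame is modeled as nothing more than a list of scan lines.

```python
def split_fields(frame_rows):
    """Split a frame (scan lines listed top to bottom) into two fields.

    Scan lines are numbered from 1, so field 1 holds lines 1, 3, 5, ...
    and field 2 holds lines 2, 4, 6, ...
    """
    field1 = frame_rows[0::2]  # odd-numbered scan lines
    field2 = frame_rows[1::2]  # even-numbered scan lines
    return field1, field2


def field_order(frame_names):
    """Return the order in which an interlaced display draws the fields."""
    return [f"{name}{n}" for name in frame_names for n in (1, 2)]


frame = [f"line {i}" for i in range(1, 481)]  # the 480 picture-carrying lines
f1, f2 = split_fields(frame)
print(len(f1), len(f2))      # 240 240
print(field_order("ABCD"))   # ['A1', 'A2', 'B1', 'B2', 'C1', 'C2', 'D1', 'D2']
```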
Conventional 35 mm and 70 mm film is shot at 24 frames per second. On the
motion picture screen, visible flicker is minimized by flashing 48 images
per second. To maintain proper speed onscreen, the projector shows each
frame twice. So any four film frames would be projected as A, A, B,
B, C, C, D, D. Since the frame rates of film and NTSC video are quite
different (24 film frames per second as opposed to 29.97 video frames per
second), when we transfer film to video or try to display film from video,
we have a bit of a problem.
Simply transferring each film frame onto each video frame would result in a
film running about 24.9% faster than intended; 29.97 film frames would be
shown during each second rather than the correct 24. The clever solution to
this problem is to repeat film frames periodically in a very straightforward
but mathematically significant way; the resultant redundancy prevents the
apparent speedup of the film when shown at the conventional video frame
rate. This is how it's done.
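The speedup figure is easy to check. The sketch below assumes the exact NTSC frame rate of 30000/1001 (the value usually rounded to 29.97). The two-hour running time is just an example number.

```python
from fractions import Fraction

FILM_FPS = Fraction(24)
NTSC_FPS = Fraction(30000, 1001)  # exact rate usually quoted as 29.97

# One film frame per video frame means the film plays this much too fast.
speedup = NTSC_FPS / FILM_FPS - 1
print(f"{float(speedup):.1%}")    # 24.9%

# An example two-hour film would be over in about 96 minutes.
print(f"{120 / float(NTSC_FPS / FILM_FPS):.1f} minutes")   # 96.1 minutes
```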
Telecine to NTSC Video
The telecine machine used to transfer film to video for composite D2 masters
(which may be used for VHS cassettes, laserdiscs, and broadcast) projects
film onto a video imager at 59.94 frames per second (identical to and
synchronized with the video field rate) and repeats film frames in a
recurring 3:2 pattern. In other words, the film frame sequence is A, A, A,
B, B, C, C, C, D, D, and so on:
[Figure: The Telecine 3:2 Pulldown Process for NTSC Video]
The first film frame, A, is repeated three times and is recorded as field 1
and field 2 of the first video frame, and field 1 of the second video
frame. The second film frame, B, is repeated twice and is recorded as field
2 of the second video frame and field 1 of the third video frame. The third
film frame, C, is repeated three times and is recorded as field 2 of the
third video frame and fields 1 and 2 of the fourth video frame. The fourth
film frame, D, is repeated twice and is recorded as field 1 and field 2 of
the fifth video frame. See the pattern? Repeat this sequence six times and 24
frames of film become 30 frames of video.
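The cadence described above can be sketched in a few lines of Python. This is only an illustration: `pulldown_32` is a made-up name, and film frames are represented by labels rather than real images.

```python
from itertools import cycle


def pulldown_32(film_frames):
    """Map film frames to interlaced video frames with a 3:2 cadence.

    Frames are repeated 3, 2, 3, 2, ... times as fields; consecutive fields
    then pair up as (field 1, field 2) of each video frame. A trailing
    unpaired field, if any, is dropped.
    """
    fields = []
    for frame, repeats in zip(film_frames, cycle((3, 2))):
        fields.extend([frame] * repeats)
    return [(fields[i], fields[i + 1]) for i in range(0, len(fields) - 1, 2)]


print(pulldown_32("ABCD"))
# [('A', 'A'), ('A', 'B'), ('B', 'C'), ('C', 'C'), ('D', 'D')]
print(len(pulldown_32(range(24))))   # 30
```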
MPEG-2 and DVD
So the basis of this technique is to restore proper timing by generating
redundant image information from four film frames within every five NTSC
video frames. But wouldn't it be silly to waste 20% of the storage space on
every DVD with duplicate picture data? Fortunately the MPEG-2 standard nicely
avoids this inefficiency. When a film source is encoded for presentation on
DVD, it is stored at 24 frames per second; each video frame contains all the
picture information from each film frame. There is no redundancy or
duplication. Such a transfer is written to DVD as 720-pixel wide by
480-pixel high interlaced frames (where each frame contains two 720 by 240
fields), and there are only 24 frames for each second of film. This is known
as 480i24. On each DVD encoded from a film source, a flag is inserted within
the MPEG-2 data stream that instructs the player to repeat certain fields to
reconstruct the 29.97 frame per second interlaced video. The player obliges
by performing the 3:2 pulldown in real-time, continually creating interlaced
frame sequences just like the one shown in the above figure, "The Telecine
3:2 Pulldown Process for NTSC Video." This capability enables the player to
produce video compatible with conventional displays that were designed based
on the NTSC video standard. (As we shall see later, progressive scan DVD
players take a different approach.)
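Here is a rough sketch of what the player does with those flags. The article does not name them; to the best of my knowledge the relevant per-picture MPEG-2 flags are `top_field_first` and `repeat_first_field`, and the flag values below are my own choice, picked to reproduce the A, A, A, B, B, C, C, C, D, D cadence described earlier. Treat it as an illustration, not a decoder.

```python
def play_with_flags(coded_frames):
    """Expand stored frames into output fields, honouring the repeat flags.

    Each coded frame is (name, top_field_first, repeat_first_field).
    """
    fields = []
    for name, tff, rff in coded_frames:
        first, second = ("top", "bottom") if tff else ("bottom", "top")
        fields += [(name, first), (name, second)]
        if rff:
            fields.append((name, first))  # show the first field once more
    return fields


# Assumed flag values for four stored film frames (illustrative only).
stored = [("A", True, True), ("B", False, False),
          ("C", False, True), ("D", True, False)]
fields = play_with_flags(stored)
print("".join(name for name, _ in fields))   # AAABBCCCDD
print(len(stored), "stored frames ->", len(fields) // 2, "video frames")
```

Note that the top and bottom fields keep alternating across the whole sequence, which is what an interlaced display expects.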
The Downside
While the 3:2 pulldown process restores the proper speed of the film on
video, it generates some unpleasant problems. Two sequential video frames
within every five video frame sequence contain images from different film
frames. If there is movement of the images on film, 40% of the video frames
will contain visually distorted information. Let me steal a figure from my
Anamorphic Widescreen piece to demonstrate.
[Figure: left panel, Stationary Camera; right panel, Camera Panning Left]
The video frame on the left is fine. The circle on film was quite still and
so the odd and even scan lines paint a stable video picture. Now let's pan
left on film, causing an apparent motion of the circle to the right. Notice
that within the video frame on the right - one of the two frames in the five
video frame sequence that contains fields from two different film frames -
the circle is in one position based on the odd scan lines and in another
position based on the even scan lines. What a mess. Repeat this process 40%
of the time and the eye sees a loss of focus, a smearing of detail, for any
moving object. For those frames that contain a quick cut from one scene to
another, the image may become even odder:
[Figure: Scene Change]
Here, the film editor has cut from our image of the black circle to images
of a green rectangle to the right and part of a blue cone to the left. For a
video frame that captures images from these two different film frames, one
before and one after the scene change, all three objects appear on the video
display for the duration of the video frame. For that brief snatch of time
(33.37 msec), our vertical resolution has been cut in half for that film
frame.
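The 40% figure and the 33.37 msec duration both fall out of the field sequence. A small sketch, using letters in place of real fields and assuming the exact NTSC frame rate of 30000/1001:

```python
fields = list("AAABBCCCDD")   # field sequence produced by the 3:2 pulldown
video_frames = [(fields[i], fields[i + 1]) for i in range(0, len(fields), 2)]

# A video frame is "mixed" when its two fields come from different film frames.
mixed = [n + 1 for n, (f1, f2) in enumerate(video_frames) if f1 != f2]
print(mixed)                              # [2, 3]
print(len(mixed) / len(video_frames))     # 0.4

frame_ms = 1000 / (30000 / 1001)          # duration of one video frame
print(round(frame_ms, 2))                 # 33.37
```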
These are the two spatial artifacts caused by interlacing; I'll touch upon a
temporal artifact soon. Now, let's see if we can minimize these spatial
flaws as we watch our DVDs.
Line Doublers
Let's start with the solution that's been around the longest: reversing the
3:2 pulldown as the video is converted within a line doubler from interlaced
video to progressive video.
Each progressive video frame is reconstructed by weaving together the odd
and even fields from images that were derived from the same film frame. The
video frames are then shown at double the conventional NTSC video frame rate
in a 3:2 repeating pattern. This effectively doubles the number of
horizontal scan lines during each second, hence the name of the instrument
that performs the work: a line doubler. (It isn't clear whether C2 is derived
from interlaced frame 3 or interlaced frame 4; I arbitrarily showed frame
4.) The technically astute might notice that this scheme seems to require
going forward in time to reconstruct frame B. Actually, frame buffers and
double buffering techniques are used to overcome this problem. This implies
that there is a slight delay through such complex video circuitry, but
experience has shown that synchronization with non-delayed audio is not an
issue.
While lacking strict temporal consistency - every other film image stays on
the screen one and a half times as long as the one before it, causing a
subtle jerking or juddering during smooth pans - the spatial distortion of
interlacing fields from two different film frames is
gone. Since the reverse 3:2 pulldown is somewhat complex, requiring some
field analysis on the fly to get it right, it's not available in all line
doublers. You'll find this feature in the surprisingly inexpensive DVDO line
doubler and such high-end video processors as those from
Faroudja. Variations on this theme may be found in some quadruplers,
interpolators, and scalers, but that's a topic for some other time.
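A toy version of the reverse pulldown is sketched below. It assumes every field already carries a label naming its source film frame; a real line doubler has to work that out by analyzing the picture content, which is the hard part. The name `reverse_pulldown` is invented for illustration.

```python
from itertools import groupby


def reverse_pulldown(video_frames):
    """Return (film_frame, repeat_count) pairs for the progressive output.

    Fields are flattened in display order and grouped by source film frame;
    each group is woven into one progressive frame that is shown once per
    field it originally occupied.
    """
    fields = [src for frame in video_frames for src in frame]
    return [(src, len(list(run))) for src, run in groupby(fields)]


interlaced = [("A", "A"), ("A", "B"), ("B", "C"), ("C", "C"), ("D", "D")]
cadence = reverse_pulldown(interlaced)
print(cadence)   # [('A', 3), ('B', 2), ('C', 3), ('D', 2)]

# Time on screen at 59.94 progressive frames per second: 50.05 ms vs 33.37 ms.
for src, n in cadence:
    print(src, round(n * 1001 / 60, 2), "ms")
```

The uneven 3, 2, 3, 2 durations are the judder: every other film frame stays up one and a half times as long as its neighbor.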
Progressive Scan DVD Players
As I mentioned earlier, film is stored on DVD as 480i at the equivalent of
24 frames per second. When a conventional player recognizes the appropriate
MPEG-2 frame repeat flag, it performs the 3:2 pulldown in real-time, but
progressive scan players can react to this flag in a different way. Such a
player can create progressive video in real time. It reconstructs each video
frame by weaving together its odd and even fields, then repeats the video
frames in a recurring 3:2 pattern. The resulting video signal will contain
the same frame sequence and the same horizontal and vertical scan rates as
are produced by the line doubler. This is a simpler process than is required
in a line doubler since the player does not have to examine the fields to
determine how to perform the weaving; no DVD derived from film contains a
video frame with images from two film frames.
One potential advantage of performing this process within the DVD player is
that it's done entirely in the digital domain, so no signal degradation
occurs. An external line doubler accepts a DVD's video signal in analog
form, such as component or S-video. The line doubler must digitize the video
to bring it into its digital processing circuitry. The line-doubled digital
video is then transformed to analog once again for compatibility with the
video display. With no fewer than an analog buffer, an anti-aliasing filter,
a sample-and-hold, an analog-to-digital converter, a digital-to-analog
converter, another anti-aliasing filter, and another analog buffer involved
in the conversions from analog to digital to analog, there's quite a bit of
circuitry that can get in the way of a pristine signal. Only the most
expensive video processors, costing thousands of dollars, will perform these
tasks without visibly degrading the video.
Please note that for a video display to properly present such progressive
video or line-doubled signal, it must be capable of dealing with about
31,500 scan lines per second - twice the normal rate. Interestingly, the
vertical sync rate remains the same as conventional NTSC video, 59.94 Hz.
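The scan-rate arithmetic, as a quick check. It assumes all 525 lines of an NTSC frame and the exact vertical rate of 60000/1001 Hz:

```python
LINES_PER_FRAME = 525
VERTICAL_HZ = 60000 / 1001   # 59.94 Hz, unchanged by line doubling

# Interlaced: each vertical pass paints one field, i.e. half the lines.
interlaced_hz = LINES_PER_FRAME * VERTICAL_HZ / 2
# Progressive: each vertical pass paints every line.
progressive_hz = LINES_PER_FRAME * VERTICAL_HZ

print(round(interlaced_hz), round(progressive_hz))   # 15734 31469
```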
The Computer and a Possible Future
Because many computer displays are capable of broader ranges of horizontal
and vertical scan rates, it is possible to create temporally symmetrical
progressive video that runs at two or three times the film's frame rate: 48
or, more commonly, 72 frames per second. To maintain proper timing, each
frame must be repeated two or three times, respectively, so the sequence
becomes A, A, B, B, C, C, D, D or A, A, A, B, B, B, C, C, C, D, D, D. Each
film image is shown on the video display for precisely the same amount of
time, creating our temporal symmetry. So not only have we eliminated the
spatial distortions, juddering during smooth pans is now gone as well.
48 frames per second require a horizontal scan rate of 25,200 Hz and a
vertical sync rate that extends down to 48 Hz. 72 frames per second require
a horizontal scan rate of 37,800 Hz and a vertical sync rate that extends up
to 72 Hz. Interestingly, many front projectors are capable of these
rates. I've received e-mail from home theater enthusiasts who prefer to use
their computers as DVD players to take advantage of this flavor of
progressive scan on such projectors. I suspect that as more capable video
displays become readily available, we may see standalone progressive DVD
players that offer the 48 or 72 frame per second playback option.
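The same arithmetic covers the 48 and 72 frame per second modes. The sketch assumes 525 total scan lines per frame, which is what the figures quoted above imply; `scan_rates` is a made-up helper name.

```python
LINES_PER_FRAME = 525   # assumed, consistent with the rates quoted above
FILM_FPS = 24


def scan_rates(repeats):
    """Return (refresh Hz, horizontal Hz, ms each film frame stays on screen)."""
    refresh_hz = FILM_FPS * repeats
    horizontal_hz = LINES_PER_FRAME * refresh_hz
    on_screen_ms = 1000 * repeats / refresh_hz
    return refresh_hz, horizontal_hz, on_screen_ms


for repeats in (2, 3):
    print(scan_rates(repeats))
```

In both modes every film frame stays on screen for the same 1/24 of a second, which is why the judder disappears.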
Parting Thoughts
It's quite remarkable how much the image quality can be improved by
eliminating 3:2 pulldown artifacts with an appropriate reverse process while
converting to progressive video. Throw in a good anamorphic transfer with no
edge enhancement and the presentation is surprisingly film-like. Interested?
You can expect progressive scan DVD players to be introduced by several
manufacturers this year. Or you might want to investigate a line doubler,
perhaps the affordable DVDO. And as HDTV-ready display prices come down,
more and more of you will be able to enjoy the best home theater currently
has to offer.
Here are a few of my command-line options for DVDs:
Mencoder
mencoder -ofps 24000/1001 -mc 1 -ovc lavc -oac lavc \
-lavcopts acodec=ac3:abitrate=160:vcodec=mpeg2video:lmin=1:lmax=10:\
keyint=15:trell:vmax_b_frames=2:mbd=2:precmp=2:subcmp=2:cmp=2:dia=-10:\
predia=-10:cbp:mv0:dc=10:vstrict=-1:vrc_buf_size=1835:vbitrate=6000:\
vrc_maxrate=9800:vqscale=4 -af volume=+0dB,volnorm=2,lavcresample=48000 \
-srate 48000 -channels 2 -sws 9 -vf denoise3d=1.33:1:2,pp=hb/vb/dr/al/lb,\
scale=716:476,expand=720:480,scale=720:480,harddup -of mpeg -mpegopts \
format=dvd:vaspect=16/9:telecine:init_vpts=400:init_apts=400 \
-mc 1 -o output_file.mpg input_file.avi
This produces an NTSC file with 3:2 pulldown. The video is encoded at
23.976 fps (24000/1001), while flags in the stream tell the DVD player to
play it back as 29.97 fps (that's what the 'telecine' option in mpegopts
does). This way we get a much smaller file at the same quality. For PAL
encodings use '-ofps 25' and leave 'telecine' out of mpegopts.
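The relationship between the two frame rates can be checked with exact fractions. This is only arithmetic on frame counts. How much smaller the file actually gets depends on the encoder and the material.

```python
from fractions import Fraction

stored_fps = Fraction(24000, 1001)             # what '-ofps 24000/1001' encodes
displayed_fps = stored_fps * Fraction(5, 4)    # 4 stored frames play as 5

print(displayed_fps == Fraction(30000, 1001))  # True
print(float(1 - stored_fps / displayed_fps))   # 0.2, i.e. 20% fewer frames stored
```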
The lavcopts settings are chosen for very high quality encodes. You can
raise abitrate to, e.g., 224 if you want higher quality sound. The vqscale
option sets a constant quantizer (4 in this case): the lower the quantizer,
the better the quality, but also the bigger the file. Quantizer 1 is not
recommended (compatibility issues). If you want a fixed-size file, you
should use 2-pass encoding.
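For a fixed-size 2-pass encode you need a target video bitrate. A back-of-the-envelope sketch follows; the function name and the example numbers (4300 MB, 120 minutes, 160 kbit/s audio) are mine, and multiplexing overhead is ignored, so leave yourself some margin.

```python
def video_bitrate_kbps(target_mb, minutes, audio_kbps):
    """Average video bitrate that fits target_mb (1 MB = 10**6 bytes)."""
    total_kbps = target_mb * 8000 / (minutes * 60)   # 1 MB = 8000 kilobits
    return total_kbps - audio_kbps


print(round(video_bitrate_kbps(4300, 120, 160)))   # 4618
```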
The '-vf' option sets up the video filters. Include 'al' (autolevels)
only if your video source looks washed out. The 'scale' and 'expand'
settings leave a 2px border around the video (giving a slightly smaller
encode; those 2 pixels are not visible on most TVs).
Note that these settings are tweaked for widescreen material. For other
aspect ratios you should change a few things (like the autoaspect option in
lavcopts), but I did not bother to check. Also, '-of mpeg' is still
experimental, so the output file might need to be remultiplexed (this issue
might be solved in mencoder pre8, which was released recently). Read the man
page for more info.
Transcode
transcode --print_status 5 --export_fps 23.976,1 -J hqdn3d=4:3:6:4:pre,pp=hb:vb:dr:al,normalize,text=string="My Cool Video":fade=2:range=150-350:font=/usr/share/fonts/truetype/ttf-bitstream-vera/Vera.ttf:pos=500x50:points=10 --video_max_bitrate 7500 -V -j0,-2 --export_prof dvd-ntsc --export_asr 3 -y mpeg2enc,ffmpeg -F 8,"-K kvcd" -N 0x2000 -R 3 -w 7,12,90 --pulldown -E 48000 -D 0 -b 160 -i input_file.avi -o output_file -m output_file.ac3
Recently I've discovered that transcode with mpeg2enc gives even better
results than mencoder, only it can be very slow sometimes. The result is
also NTSC pulled down to 29.97. Some cool options are in '-J'. The filters
used are the mplayer ones, same as in mencoder; do not use 'al' except for
washed-out videos. 'normalize' will normalize your sound. 'text' is the
easiest way to add text to your video (range is frame start/end).
- '--export_asr 3' is for widescreen.
- '-y' tells transcode to use the mpeg2enc module for video and ffmpeg for audio. You can use ffmpeg for video to gain some speed, but you will lose some quality.
- '-F' is for passing options to the mpeg2enc module:
  - '8' is for DVD format.
  - "-K kvcd" selects the quantization matrix (a very powerful feature; kvcd is a matrix tweaked to produce smaller files, around 15% smaller; other options are tmpgenc|default|hi-res - check man mpeg2enc).
- '-N 0x2000' is for ac3 output.
- '-R 3' in combination with '-w' gives a constant quantizer (7 in this case, mpeg2enc related, corresponds to mencoder's q4).
- In '-w', '12' is the keyframe setting and '90' is the crispness/smoothness factor.
Again, change the '-b' option to 224 if you want higher quality sound.
Once encoding is finished, you need to multiplex the video and audio files
(substitute the file names transcode actually wrote for video.m2v and
audio.ac3):
mplex -S 4700 -f 8 -M -V -o output_file.mpg video.m2v audio.ac3
With these settings, transcode should give you a compliant DVD video with
superb quality.
Last edited by encho : July 5th, 2006