.....
WHY COMPRESSION?
----------------
Snapz Pro 2 saves the video you record in a proprietary (compressed)
file format as you record, and converts it to QuickTime (with whatever
codec settings you have chosen) after the movie is done recording. Why
bother with this step? Why, I'm glad you asked!
Firstly, in order to get an accurate capture of image frames for the
QuickTime movie, Snapz Pro 2 operates via interrupt -- this means that,
say, 10 times a second, we get a chance to capture the screen no matter
what is going on at the time. Unfortunately, it is not possible (nor
desirable in some cases) to reliably call QuickTime at interrupt time
(sometimes it will work, sometimes it will crash), so we need to use our
own interim format.
Because we control the compression code, we can make sure it is
optimized for this specific application (capturing and compressing
screen images on the fly).
Additionally, some QuickTime codecs can take a significant amount of
time to do their compression. Our design goal with Snapz Pro 2 was to
make the video capturing/writing to disk as fast and accurate as
possible, and then let QuickTime do its thing at its own pace in the
post-processing stage.
Finally, we need to dub our audio tracks (if any) onto the movies
anyway, which would have to be done in a separate step regardless.
It was a good bit of work to properly do the video capture that Snapz
Pro 2 does, but I think the results are well worth it.
COMPRESSION SCHEME
------------------
The official buzzword-savvy name for Snapz Pro 2's compression is "frame
differential planar RLE compression". It's certainly not a new concept,
but it works well for what we're doing. Here's what that means in a
nutshell:
"RLE" stands for run length encoding; here's how it works. If you have
a line that has 20 white pixels in it, side by side, instead of storing
20 white pixels, we simply store 20 0: a count of 20 followed by the
value 0 (assuming 0 represents the color white). It's like the
difference between having a dollar and having 100 pennies -- they both
mean the same thing, but the dollar takes up a heck of a lot less space
and weight.
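Here's a minimal Python sketch of the run length idea described above. The names rle_encode and rle_decode, the (count, value) pair layout, and the 255 cap on run length are assumptions made for illustration; this is not Snapz Pro 2's actual file format.

```python
def rle_encode(values, max_run=255):
    """Collapse runs of equal values into (count, value) pairs.

    max_run is a hypothetical cap so a count would fit in one byte.
    """
    runs = []
    for v in values:
        if runs and runs[-1][1] == v and runs[-1][0] < max_run:
            # Same value as the current run: just bump its count.
            runs[-1] = (runs[-1][0] + 1, v)
        else:
            # Value changed (or the run is full): start a new run.
            runs.append((1, v))
    return runs


def rle_decode(runs):
    """Expand (count, value) pairs back into the original sequence."""
    out = []
    for count, value in runs:
        out.extend([value] * count)
    return out


# 20 white pixels (0) followed by three pixels of some other color (7).
line = [0] * 20 + [7, 7, 7]
encoded = rle_encode(line)
decoded = rle_decode(encoded)
```

In this sketch the 23 input values collapse to two pairs, and decoding them gives back the original line.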
"Planar" simply means that in direct-pixel color modes (16 and 32 bit)
we separate the pixels into "planes" of red, green, and blue, and then
compress each plane independently using the RLE described above. This
allows them to compress much better than if they were left in their
normal r,g,b "chunky" format, since a single color component often
repeats over a longer run than the whole pixel does.
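To make the planar idea a bit more concrete, here's a rough Python sketch. It assumes 16-bit pixels laid out as 5 bits each of red, green, and blue (with the top bit unused), one byte for each run count, and a made-up test row; none of those details come from Snapz Pro 2 itself.

```python
def split_planes(pixels):
    """Split 16-bit pixels (assumed x-5-5-5 RGB) into three planes."""
    red = [(p >> 10) & 0x1F for p in pixels]
    green = [(p >> 5) & 0x1F for p in pixels]
    blue = [p & 0x1F for p in pixels]
    return red, green, blue


def count_runs(values):
    """Number of (count, value) pairs a simple RLE pass would emit."""
    runs = 0
    previous = object()  # sentinel that compares unequal to everything
    for v in values:
        if v != previous:
            runs += 1
            previous = v
    return runs


# Hypothetical row: red and green are constant, blue differs on every pixel.
row = [(9 << 10) | (12 << 5) | i for i in range(32)]

# Chunky: every run costs 1 count byte + 2 bytes for the whole pixel.
chunky_bytes = count_runs(row) * (1 + 2)

# Planar: every run costs 1 count byte + 1 byte for a single component.
planar_bytes = sum(count_runs(plane) * (1 + 1) for plane in split_planes(row))
```

For this contrived row the chunky encoding comes to 96 bytes and the planar one to 68, because the two steady components each collapse into a single run. Real screen content will vary.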
"Frame differential" means that we store the results of the last frame's
image in memory, and compare it to the current frame. Only the pixels
that actually have changed between the two frames are recorded and
compressed via the planar RLE described above. If you think about
typical computer screens, very little changes from frame to frame
(mostly the cursor moving, and a few minor things drawing) so it makes
sense to record only the changes.
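The frame differential step can be sketched along these lines in Python. How Snapz Pro 2 actually records which pixels changed isn't described here, so the (start, pixels) span layout and the names diff_frame and apply_diff are purely hypothetical.

```python
def diff_frame(previous, current):
    """Return (start_index, changed_pixels) spans where the frames differ.

    Both frames are flat lists of pixel values of the same length.
    """
    spans = []
    start = None
    for i, (old, new) in enumerate(zip(previous, current)):
        if old != new:
            if start is None:
                start = i  # a changed span begins here
        elif start is not None:
            spans.append((start, current[start:i]))
            start = None
    if start is not None:
        spans.append((start, current[start:]))  # change ran to the end
    return spans


def apply_diff(previous, spans):
    """Rebuild the current frame from the previous frame plus the spans."""
    frame = list(previous)
    for start, pixels in spans:
        frame[start:start + len(pixels)] = pixels
    return frame


previous = [0] * 16
current = list(previous)
current[5:8] = [9, 9, 4]  # only three pixels changed

spans = diff_frame(previous, current)
rebuilt = apply_diff(previous, spans)
```

Only the three changed pixels end up in spans; a frame identical to the previous one produces no spans at all.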
The combination of these techniques results in some pretty good
compression for typical on-screen activities, and more importantly, it's
done quickly and reduces the overhead it takes to write the frame to
disk.
To see what this technique looks like in action, take a look at
temporal.mov and mantis.mov (both are under 200k in size). Yes, most of
the frames are supposed to be white -- these movies show how Snapz Pro 2
records *only* the pixels that have actually changed between frames. Try
single-stepping through each movie to see it in better detail.
COMPRESSION STATS
-----------------
Here are some stats for you folks, so you can understand how the RLE and
RLE/differential compression help.
In this test, the selection size was set to 304x304, and the monitor was
set to 16 bit mode. The size of the raw pixels (uncompressed) that I had
selected was:
uncompressed image size: 184,832 bytes
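That figure follows directly from the selection size and the pixel depth. A quick Python check, with variable names chosen just for this example:

```python
width, height = 304, 304      # selection size in pixels
bits_per_pixel = 16           # monitor depth used in the test
bytes_per_pixel = bits_per_pixel // 8

uncompressed_size = width * height * bytes_per_pixel
print(f"{uncompressed_size:,} bytes")
```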
Our RLE compressor went to work on the image, and managed to compress it
by about 40% -- that's not very good as far as RLE goes, but the
reason is that it was a photographic portion of my desktop picture that
has very few repeating runs in it (unlike most parts of the Mac UI
such as windows, menus, buttons, documents, etc.). Still, a savings of
roughly 40% per frame is definitely worth doing, and this is about as
close to worst-case as it'll get for typical video capture:
  RLE compressed key frame:   110,007 bytes  (it's of my desktop background,
                                              not too many runs)
  header:                       2,184 bytes  (includes the CLUT)
  512 byte align filler:          449 bytes
  total:                      112,640 bytes
After this first frame, I just let the computer sit for a bit -- I moved
the cursor over the capture area (the cursor is overlaid over the video
later, it isn't compressed with the video frame), but the contents of the
area didn't change. Because of this, and the differential compression,
the size of each frame being written to disk was positively tiny:
  each differential frame:      2,432 bytes
  header:                         128 bytes
  512 byte align filler:            0 bytes
  total:                        2,560 bytes  (weird that it's 512 aligned, eh?)
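The filler values in both tables are what you get from rounding each frame up to the next multiple of 512 bytes. Here's a small Python sketch of that arithmetic; the helper name align_filler is made up for this example.

```python
BLOCK = 512  # alignment size quoted in the tables above


def align_filler(size, block=BLOCK):
    """Bytes of padding needed to round size up to a multiple of block."""
    return -size % block


# Key frame: compressed data plus header.
key_frame = 110_007 + 2_184
key_filler = align_filler(key_frame)
key_total = key_frame + key_filler

# Differential frame: compressed data plus header.
diff_frame_size = 2_432 + 128
diff_filler = align_filler(diff_frame_size)
diff_total = diff_frame_size + diff_filler
```

The differential frame needs no filler because 2,432 + 128 is already exactly five blocks of 512 bytes.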
Of course, if parts of the capture area changed, the differential frames
would be slightly larger, but not by much, unless it was a big change.
Typical computer screen scenes work like that: small changes to the
area that happen over time. This kind of capturing algorithm is perfect
for that.
As you can probably guess, there is a world of difference in terms of CPU
lag between writing out a 184,832 byte image and a 2,560 byte image 10
times per second! That's over 72x less hard drive space, along with
the associated overhead of writing it out.
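For the curious, here's the arithmetic behind that comparison as a short Python sketch. It uses only the frame sizes and the 10 frames per second quoted above; the variable names are just for illustration.

```python
FPS = 10                 # captures per second, as in the example above
raw_frame = 184_832      # uncompressed frame, in bytes
diff_frame = 2_560       # differential frame including header, in bytes

ratio = raw_frame / diff_frame   # how many times smaller each write is
raw_rate = raw_frame * FPS       # bytes per second written uncompressed
diff_rate = diff_frame * FPS     # bytes per second written as differentials

print(f"ratio: {ratio:.1f}x")
print(f"raw:   {raw_rate:,} bytes/sec")
print(f"diff:  {diff_rate:,} bytes/sec")
```

At 10 frames per second, that's roughly 1.8 million bytes per second of raw pixels versus about 25 thousand bytes per second of differential frames.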
.....
If you have any questions, please feel free to email me at:
andrew@AmbrosiaSW.com
+--------------------------+-----------------------------------+
| Andrew Welch | Ambrosia Software, Inc. |
| Thaumaturgist | http://www.AmbrosiaSW.com/ |
+--------------------------+-----------------------------------+