Multi-faceted Compression
Aug 1, 2004 12:00 PM, By Jeff Sauer
CodecSys draws upon multiple codecs.
This article contains tables only available in PDF format. To view it, you must have the Adobe Acrobat Reader, which can be downloaded for free.
Format wars may make for good copy in the press, but users tend to hate them. Mostly, they lead to confusion and awkward choices, and typical users would be happy to avoid them altogether. That has been true with tape formats for many years, and now, in the digital era, it is true with file formats as well. Of course, a lot of companies try to end format confusion, but usually with their own allegedly vastly superior product. Alas, those major breakthroughs don't come around very often.
One company, Broadcast International, has a different approach. Its new technology, CodecSys, might help eliminate video compression format confusion. CodecSys isn't some new compression format, but rather it leverages the best aspects of existing compression formats from a variety of developers. It achieves its best results by working with the strengths of each while still keeping awkward format decisions away from the user.
CodecSys, “the Multi-Codec System,” chooses from whatever codecs it has available in the system. It picks whichever one delivers the best quality for specific content material at a given bit rate. And here's the kicker: It can use them all within a single video file.
CodecSys is an encoding engine that can switch compression formats on the fly, scene to scene, or even frame to frame if that makes compression more efficient. CodecSys has a decoder that's equally agile at smoothly switching compression formats during realtime playback.
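To make the idea of switching formats inside one file concrete, here is a minimal sketch, not Broadcast International's actual file format, which the article does not describe. It assumes each segment carries a small header naming the codec that produced it, and it uses Python's lossless zlib, bz2, and lzma compressors as stand-ins for video codecs. The names CODECS, HEADER, encode_stream, and decode_stream are hypothetical.

```python
import bz2
import lzma
import struct
import zlib

# Hypothetical codec table: id -> (name, compress, decompress).
# Lossless stdlib compressors stand in for real video codecs.
CODECS = {
    1: ("zlib", zlib.compress, zlib.decompress),
    2: ("bz2", bz2.compress, bz2.decompress),
    3: ("lzma", lzma.compress, lzma.decompress),
}

# Assumed per-segment header: 1-byte codec id, 4-byte payload length.
HEADER = struct.Struct(">BI")


def encode_stream(segments):
    """Pack (codec_id, raw_bytes) segments into one byte stream."""
    out = bytearray()
    for codec_id, raw in segments:
        _, compress, _ = CODECS[codec_id]
        payload = compress(raw)
        out += HEADER.pack(codec_id, len(payload)) + payload
    return bytes(out)


def decode_stream(stream):
    """Yield decoded segments, switching decoders whenever the header says so."""
    pos = 0
    while pos < len(stream):
        codec_id, length = HEADER.unpack_from(stream, pos)
        pos += HEADER.size
        _, _, decompress = CODECS[codec_id]
        yield decompress(stream[pos:pos + length])
        pos += length


if __name__ == "__main__":
    scenes = [(1, b"a" * 100), (2, b"b" * 50), (3, b"c" * 10)]
    decoded = list(decode_stream(encode_stream(scenes)))
    print(decoded == [raw for _, raw in scenes])
```

The point of the sketch is only that the decoder needs no advance knowledge of which codec comes next. It reads the tag and dispatches, which is one plausible way a player could switch formats scene to scene.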
If you've worked much with video compression or tested a few MPEG encoding cards side by side, you probably know that individual coding formats, even individual cards, often have specific strengths and weaknesses. For example, DCT-based compression formats like M-JPEG and the various MPEG flavors do an efficient job on some video material, but intricate details, titles and graphics with lots of sharp lines, and landscapes with a random barrage of colors tend to give DCTs trouble. Wavelet compression tends to squeeze more gracefully, but it has inherent softness at lower bit rates.
The idea behind CodecSys is to use each compression format's strengths while avoiding its weaknesses by switching to whatever codec works best for any specific content. The CodecSys compression engine, which cannot yet encode in realtime (encoding takes about 3X realtime today), performs a series of sample encodes on each scene or dramatic change in content before deciding which format is most efficient. By choosing from a menu of compression formats for any scene material, it should theoretically always be at least as efficient as any single codec on that menu.
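The sample-encode approach can be sketched in a few lines. This is an illustration under stated assumptions, not the CodecSys engine: the two "codecs" are toy functions invented for the example (one spends fewer bits on every sample, the other keeps a few full-precision samples and holds them), a "scene" is just a list of 8-bit values, and PSNR is the only quality score.

```python
import math


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB; infinite when the data match exactly."""
    mse = sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


def quantize_codec(samples, budget_bits):
    """Toy codec: keep every sample but spend fewer bits on each one."""
    bits = max(1, min(8, budget_bits // len(samples)))
    top = (1 << bits) - 1
    return [round(round(s / 255 * top) / top * 255) for s in samples]


def decimate_codec(samples, budget_bits):
    """Toy codec: keep a few full 8-bit samples and hold each one."""
    kept = max(1, min(len(samples), budget_bits // 8))
    step = math.ceil(len(samples) / kept)
    return [samples[(i // step) * step] for i in range(len(samples))]


# Hypothetical codec menu for this example.
CODECS = {"quantize": quantize_codec, "decimate": decimate_codec}


def choose_codec(scene, budget_bits):
    """Trial-encode the scene with every codec and return the best (name, psnr)."""
    scores = {name: psnr(scene, codec(scene, budget_bits))
              for name, codec in CODECS.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


smooth_ramp = [4 * i for i in range(64)]  # gentle gradient
sharp_edges = [0, 255] * 32               # hard alternating detail

if __name__ == "__main__":
    for label, scene in (("ramp", smooth_ramp), ("edges", sharp_edges)):
        print(label, choose_codec(scene, budget_bits=128))
```

At the same 128-bit budget, the gradient scores higher with the decimating codec and the hard-edged scene scores higher with the quantizing codec. Because the selector takes the maximum score over the whole menu, its choice for a scene can never score lower than any single codec on that menu, at least by the metric it is using.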
Best of all for independent codec developers who have long fought a difficult business model battle against free compression formats, CodecSys can track how much each format is used. With that information, license fees can be allocated on a use basis, treating any codec developer fairly. At present, Broadcast International operates CodecSys as a service, but the engine should soon be available in product form, although the details are somewhat unclear as yet.
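The usage-based licensing idea comes down to simple proportional arithmetic. The sketch below assumes the unit of use is frames encoded and that a fixed fee pool is split in proportion to it. The article specifies neither, and the function names and figures are made up for illustration.

```python
from collections import Counter


def tally_usage(segments):
    """Sum frames per codec from a log of (codec_name, frame_count) entries."""
    usage = Counter()
    for codec, frames in segments:
        usage[codec] += frames
    return usage


def allocate_fees(usage, fee_pool):
    """Split fee_pool across codecs in proportion to frames encoded."""
    total = sum(usage.values())
    if total == 0:
        return {codec: 0.0 for codec in usage}
    return {codec: fee_pool * frames / total for codec, frames in usage.items()}


if __name__ == "__main__":
    # Hypothetical encode log for one program.
    log = [("h264", 500), ("wavelet", 250), ("h264", 250)]
    print(allocate_fees(tally_usage(log), fee_pool=100.0))
```

With that log, H.264 handled three quarters of the frames and would receive three quarters of the pool. A real scheme might weight by bits, duration, or negotiated rates instead.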
Man vs. machine?
As wonderful as CodecSys may sound in theory, how can a computer, no matter how sophisticated, accurately make good decisions about what looks best to the human eye? Compressing video well can be an art, done best by trained professionals with eyes attuned to problem scene material and compression-related artifacts. In extreme cases, compressionists might encode clips or sections of clips several times with different settings to achieve the best looking video at a target bandwidth. Or they may experiment to see just how low they can go without the visual quality falling apart. But decisions are always based on some human idea of visual quality: what looks good or at least acceptable.
CodecSys attempts to do that through a variety of algorithms, beginning with simple comparison analyses of how much change and noise the compression adds. Peak signal-to-noise ratio (PSNR) comparisons are the first level of analysis, although there are others. Root mean square error (RMSE) measures the average size of the difference between the original and compressed images. Straight signal-to-noise ratio (SNR) and just noticeable difference (JND) image-quality analyses are also used.
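For readers who want to see what those numbers are, here is a minimal sketch of the three purely arithmetic metrics using their standard textbook definitions, applied to a flat list of 8-bit pixel values. How CodecSys computes or weights them is not stated in the article. JND is left out because it depends on a model of human vision rather than a single formula.

```python
import math


def mse(original, decoded):
    """Mean squared error between two equal-length pixel lists."""
    return sum((o - d) ** 2 for o, d in zip(original, decoded)) / len(original)


def rmse(original, decoded):
    """Root mean square error, in the same units as the pixel values."""
    return math.sqrt(mse(original, decoded))


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB, relative to the maximum pixel value."""
    err = mse(original, decoded)
    return math.inf if err == 0 else 10 * math.log10(peak ** 2 / err)


def snr(original, decoded):
    """Signal-to-noise ratio in dB, relative to the actual signal power."""
    err = mse(original, decoded)
    signal = sum(o * o for o in original) / len(original)
    return math.inf if err == 0 else 10 * math.log10(signal / err)


if __name__ == "__main__":
    original = [100, 100, 100, 100]
    decoded = [110, 90, 110, 90]
    print(rmse(original, decoded), psnr(original, decoded), snr(original, decoded))
```

All three are driven by the same squared pixel error, which is why they tend to agree with each other and why none of them, on its own, says anything about which errors a viewer will actually notice.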
Yet all of those metrics are calculations that try to place value judgments on the very subjective determination of quality. Experienced compressionists know that the best compression, especially low bit rate compression, isn't always the cleanest, most error-free, or best averaged. Mild blurring, for example, adds noise, but it also smooths hard edges to eliminate overt and awkward compression artifacts, often with preferable results. Can a computer ever do a better job than a trained eye?
In a way, this brute processor-strength method of judging picture quality may seem a lot like IBM's Deep Blue computer playing chess against a grandmaster like Garry Kasparov. Chess is a very logical game with clear rules and seemingly clear cause-and-effect dynamics, so processing power would seem invaluable. And, against most modest chess opponents, a powerful computer's if-then analysis capability would win almost all of the time. In fact, Deep Blue beat Kasparov in a 1997 match, but that came after more than a decade of trying. The simple law of averages would have the machine win eventually.
But at the highest levels of chess, the game is less about brute strength than it is about seeing patterns, recognizing similarities to past game situations, and developing new strategies on the fly that often focus on more subtle goals than straight victory or the simple point value of a given piece. Kasparov has beaten Deep Blue more times than he has lost to it.
Of course, machines like waveform monitors and vectorscopes are invaluable tools for judging video quality. Yet the trained eye is the ultimate judge of compression quality, often balancing one type of artifact against another less objectionable one and judging acceptable levels of chroma sub-sampling and quantizing.
Eye of the beholden
So, how can Broadcast International claim to be better than the human eye when it comes to judgments of aesthetic quality? Ultimately, it doesn't. The Broadcast International engineers with whom I've spoken would probably bet on the video equivalent of Kasparov and the trained eyes of experienced compressionists when it comes to making ultimate decisions on quality. But how many organizations have a video Kasparov? For most organizations doing compression, the better analogy might be Deep Blue against that mere mortal chess player and, like Deep Blue against you and me, CodecSys will do pretty well.
Even that's probably not the right analogy. CodecSys targets businesses and organizations that compress and distribute hours of video content over closed networks, even over satellites where every bit of data means dollars. Very little of that compression is done in any sort of a hands-on, eyes-alert manner. Ultimately, CodecSys isn't trying to beat Kasparov's or anyone else's eyes. It really claims only to beat other compression systems that are limited to a single compression format. That's where CodecSys should always gain efficiency.
If CodecSys were, for example, to go up against the new hot compression format H.264, it could do no worse than break even because it could also use H.264 to compress an entire stream. Because CodecSys already has multiple implementations of the H.264 codec in its repertoire, however, it would always have the chance to use whichever one performed better on a given scene. That alone could put CodecSys ahead, but it still would have plenty more codecs to choose from on any scene depending on the type of content.
CodecSys is ultimately only as good as its parts, and currently Broadcast International lists just a handful of codecs. It does include a couple of different versions of H.264, as well as versions of MPEG-4, -2, and -1, H.261, H.263, M-JPEG, VP3, VSS, several versions of DivX, several of Xvid, and other proprietary codecs licensed from individual developers, including wavelet-based compression. However, Broadcast International has not yet licensed well-known formats like Windows Media or Real, although it would like to do so. For now, those are potentially serious omissions.
Still, CodecSys is an open platform. It's the codec-switching technology that is the secret sauce, and ultimately, any current or future codec can be added to the system to make it more efficient, including WM9S, Real, and Sorenson. Additionally, as better techniques emerge that allow machines to better predict visual quality, they can be easily incorporated as well.
Is it 10 percent more efficient than H.264? Or 5 percent better than WM9S? Unfortunately, it's hard to put numbers to a system like CodecSys, especially in these early stages of its public debut, because it is so highly dependent on scene content and available codecs. Yet there is little doubt that, from a theoretical perspective at least, two heads are better than one. What CodecSys is trying to do is put all the compression-developing heads out there together in one package, all working for you.
Best of all, if CodecSys gains a critical mass of compression formats and technologies, it might relieve you of ever having to choose among compression formats again.