Video compression, codecs and transcoding
Uwe Steinmueller | Video Capture | Published Aug 31, 2011
Video compression: your friend and enemy
Video compression is clearly our friend: without a lot of compression we would have a very hard time handling the massive amount of data in a 1080p video stream. Think of about two megapixels per frame at 24, 30 or 60 frames per second, which translates to roughly 48, 60 or even 120 megapixels of data per second. On the flip side, video compression limits the image quality we can get, so it is worth understanding the implications of working with compressed data. It is somewhat like the difference between Raw and JPEG images for still cameras, though with video the compression is a lot stronger. That said, it's important to remember that when we watch a movie, the moving image is rarely analyzed as critically as a still image.
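To put rough numbers on this, here is a small sketch of the arithmetic. It assumes a 1920 x 1080 frame and 8 bits for each of three color channels; neither figure comes from the text above, which simply rounds a frame to two megapixels, so treat the results as ballpark values.

```python
# Ballpark uncompressed data rates for 1080p video.
# Assumptions (not from the article): 1920 x 1080 frames, 3 channels at 8 bits each.
WIDTH, HEIGHT = 1920, 1080
BITS_PER_PIXEL = 3 * 8


def pixels_per_second(fps: int) -> int:
    """Number of pixels the camera has to deliver every second."""
    return WIDTH * HEIGHT * fps


def uncompressed_mbit_per_second(fps: int) -> float:
    """Data rate in Mbit/s if nothing were compressed at all."""
    return pixels_per_second(fps) * BITS_PER_PIXEL / 1_000_000


for fps in (24, 30, 60):
    megapixels = pixels_per_second(fps) / 1_000_000
    mbit = uncompressed_mbit_per_second(fps)
    print(f"{fps} fps: {megapixels:.1f} megapixels/s, {mbit:.0f} Mbit/s uncompressed")
```

Under these assumptions even 24 frames per second comes to more than a gigabit per second, which is why cameras compress so heavily before writing to the card.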
Overall video compression is about the trade-off between:
- Data volume
- Data storage needs
- Data processing speed (in camera, on computer)
- Image quality
Video Compression Details
Because the data is digital, some lossless (non-destructive) algorithms are also used, but we don't need to worry about them because they have no influence on image quality.
Here are the different techniques commonly used to get the video stream down to a manageable data volume:
- Detail Compression
This is more like JPEG, where the algorithms try to preserve the overall impression of the image but sacrifice finer detail.
- Bit depth:
8-14 bit. All prosumer video cameras and HDSLRs create video streams at 8 bit, which is kind of like shooting 8 bit JPEGs with stronger compression. And just like when you shoot stills, if you have more than 8 bits per color channel available then you have more latitude during color correction (also called 'grading'). At 10 bit there are 1024 shades per color channel, as opposed to only 256 at 8 bit.
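The shade counts follow directly from the bit depth. The sketch below also gives a feel for the grading latitude: the function `levels_after_stretch` is a hypothetical helper, and the idea of stretching one quarter of the tonal range to fill the whole range is only an illustrative assumption about a strong correction.

```python
def shades_per_channel(bits: int) -> int:
    """Distinct values a single color channel can take at this bit depth."""
    return 2 ** bits


def levels_after_stretch(bits: int, fraction: float) -> int:
    """Distinct levels left when only `fraction` of the tonal range
    is stretched to cover the full range during grading."""
    return int(shades_per_channel(bits) * fraction)


for bits in (8, 10, 12, 14):
    print(
        f"{bits} bit: {shades_per_channel(bits)} shades per channel, "
        f"{levels_after_stretch(bits, 0.25)} levels after stretching the darkest quarter"
    )
```

With 8 bit source material a stretch like this leaves only 64 distinct levels per channel, which is where banding tends to show up; 10 bit material still has 256.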
- Chroma Subsampling
'Because the human visual system is less sensitive to the position and motion of color than luminance, bandwidth can be optimized by storing more luminance detail than color detail. At normal viewing distances, there is no perceptible loss incurred by sampling the color detail at a lower rate. In video systems, this is achieved through the use of color difference components. The signal is divided into a luma (Y') component and two color difference components (chroma)' (from this Wikipedia article).
The ideal is 4:4:4 sampling, but this often produces files that are too big for consumer use. Pro cameras often use 4:2:2 and most HDSLRs have only 4:2:0. I am told that experienced moviemakers can easily see the difference between 4:2:2 and 4:2:0 in a direct comparison. This likely is again mostly an issue during editing if you make major color corrections.
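As a rough guide to how much data each scheme saves, the sketch below counts samples in the usual J:a:b reference block (J pixels wide, two rows high). The function name is made up for illustration, and the count ignores everything that happens after subsampling.

```python
def samples_per_pixel(j: int, a: int, b: int) -> float:
    """Average number of stored samples per pixel for a J:a:b scheme.

    The reference block is j pixels wide and 2 rows high.
    Luma (Y') is stored for every pixel; each of the two chroma
    components stores `a` samples in the first row and `b` in the second.
    """
    luma = 2 * j
    chroma = 2 * (a + b)  # two chroma components (Cb and Cr)
    return (luma + chroma) / (2 * j)


for scheme in ((4, 4, 4), (4, 2, 2), (4, 2, 0)):
    label = ":".join(str(n) for n in scheme)
    print(f"{label}: {samples_per_pixel(*scheme)} samples per pixel")
```

Relative to 4:4:4, 4:2:2 stores two thirds of the data and 4:2:0 only half, before any further compression is applied.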
- Group of Pictures (GOP)
This is a very complex video compression method. It stores the video in I- and P-Frames (there are also D- and B-Frames, which allow more efficient encoding). The idea is to have an I-Frame that is a normal, fully defined image (of course compressed by the other methods), followed by P-Frames that are not full images on their own. P-Frames store delta information used to create a new frame starting from the I-Frame or from previously re-created frames (Wikipedia entry).
|GOP compression structure showing one I and two P frames per group of pictures.|
Here is the catch:
- Let's assume you have a static scene with no content movement or change. The P-Frame would just say 'nothing changed' and use hardly any data.
- If there is fast movement that results in a lot of changes:
- The real-time video encoder will be challenged and may even skip frames
- The P-Frames will get complex
- Quality can suffer
Longer GOP structures (e.g. 15 frames long) are more efficient but can also be more problematic because the full I-Frames are many frames apart.
GOP sequences are a very powerful way to reduce data but may also show limitations with fast action. Overall the results are quite impressive. Consider that your Blu-ray discs use the same basic methods, but with much more sophisticated encoding software than the camera's real-time encoders.
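The I-Frame/P-Frame idea can be illustrated with a deliberately simplified toy. This is not how H.264 works internally: real encoders use motion compensation and lossy residuals, whereas this sketch just stores exact per-pixel changes. The frame representation (a flat list of integers) and all function names are assumptions made for the example.

```python
from typing import Dict, List, Tuple

Frame = List[int]       # toy frame: a flat list of pixel values
Delta = Dict[int, int]  # pixel index -> new value


def encode_gop(frames: List[Frame]) -> Tuple[Frame, List[Delta]]:
    """Keep the first frame in full (I-Frame); store each later frame
    only as the pixels that changed since the previous frame (P-Frames)."""
    i_frame = list(frames[0])
    p_frames: List[Delta] = []
    previous = i_frame
    for frame in frames[1:]:
        delta = {
            idx: new
            for idx, (old, new) in enumerate(zip(previous, frame))
            if old != new
        }
        p_frames.append(delta)
        previous = frame
    return i_frame, p_frames


def decode_gop(i_frame: Frame, p_frames: List[Delta]) -> List[Frame]:
    """Rebuild every frame by applying the deltas one after another."""
    frames = [list(i_frame)]
    for delta in p_frames:
        frame = list(frames[-1])
        for idx, value in delta.items():
            frame[idx] = value
        frames.append(frame)
    return frames


static_scene = [[10, 10, 10, 10], [10, 10, 10, 10], [10, 10, 10, 10]]
fast_action = [[10, 10, 10, 10], [50, 60, 10, 10], [70, 80, 90, 99]]

for name, clip in (("static", static_scene), ("fast action", fast_action)):
    _, deltas = encode_gop(clip)
    print(name, "P-Frame sizes:", [len(d) for d in deltas])
```

In the static clip the P-Frames are empty, while in the fast-action clip the last P-Frame has to describe every pixel. A real-time encoder with a fixed data budget has to cope with the second case somehow, which is where quality can suffer.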
- Bit Rate
The video system (camera or player) has to work within the capabilities of its devices. A video stream has to be read or written continuously for the entire movie or clip. If the video camera writes to an SD or CF card, it must not write faster than the card can sustain (this depends, of course, on the specifications of the card and camera). That is why the encoding system in the camera enforces a maximum bit rate (e.g. 17 megabits per second, 17 Mbit/s). If you have ever looked at your JPEG images you will have found that the files vary in size: images with more detail or more noise produce bigger files. The video encoding system has to ensure that the bit rate never exceeds the limit. It does this by compressing more aggressively whenever the data would otherwise be too large to stay within the bit rate.
Clearly higher bit rates result in better image quality but also create more data and need faster devices (e.g. cards).
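A few lines of arithmetic show what a bit rate limit means in practice. The helper names and the card speed used in the example are assumptions for illustration; the 17 and 24 Mbit/s figures are the ones mentioned in this article.

```python
def megabytes_per_minute(bit_rate_mbit: float) -> float:
    """Storage used by one minute of video at a constant bit rate."""
    return bit_rate_mbit / 8 * 60


def card_can_sustain(bit_rate_mbit: float, card_write_mb_per_s: float) -> bool:
    """True if the card's sustained write speed covers the stream."""
    return card_write_mb_per_s >= bit_rate_mbit / 8


def required_compression_ratio(source_mbit: float, limit_mbit: float) -> float:
    """How strongly the encoder must compress to stay under the limit.
    A ratio of 1.0 means the data already fits."""
    return max(1.0, source_mbit / limit_mbit)


for rate in (17, 24):
    print(f"{rate} Mbit/s uses {megabytes_per_minute(rate)} MB per minute")

# Hypothetical card with a sustained write speed of 4 MB/s.
print("4 MB/s card handles 24 Mbit/s:", card_can_sustain(24, 4.0))
```

The last helper captures the rule described above in its simplest form: the more data a scene produces relative to the limit, the more aggressively the encoder has to compress it.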
- Camera Encoders
At a given bit rate, different cameras can produce different video quality. The camera's H.264 encoder has to perform a complex task in real time on a low-powered processor. It is to be expected that these encoders (part of the camera's processing chip) will improve over time as better algorithms are found and in-camera processors get more powerful. Larger cameras can have an advantage here because the camera housing allows better cooling and therefore faster processors. This means that the bit rate alone does not tell the full story about the final image quality.
Codecs are software that enable your devices (Camera, Editor, Player) to perform video encoding and/or decoding (think video compression and de-compression). Obviously your Blu-ray player needs only to decode a video stream. All these Codecs are based on standards (often de-facto company standards).
All video streams are embedded in multimedia (video) containers. It is a common misconception that .AVI (Windows) and .MOV (Quicktime on Macs) already define what kind of video format a file uses. In fact these containers include information about which Codec is used in the file. If your system does not have the proper Codec installed the movie won't play.
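The distinction between container and Codec can be modelled in a few lines. This is only a conceptual sketch: the `Clip` class, the file names and the `can_play` function are hypothetical, and no real file is inspected.

```python
from dataclasses import dataclass
from typing import Set


@dataclass(frozen=True)
class Clip:
    filename: str     # the extension only tells you the container
    video_codec: str  # the Codec is recorded inside the container


def can_play(clip: Clip, installed_codecs: Set[str]) -> bool:
    """A clip plays only if the Codec named inside the container is installed."""
    return clip.video_codec in installed_codecs


# Two hypothetical .MOV files that share a container but not a Codec.
camera_clip = Clip("camera_clip.mov", "H.264")
editing_clip = Clip("editing_clip.mov", "Apple ProRes")

installed = {"H.264"}
for clip in (camera_clip, editing_clip):
    print(clip.filename, "plays:", can_play(clip, installed))
```

Both files end in .mov, yet only one of them plays on this imaginary system, which is exactly the misconception described above.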
H.264 is one of the most important HD Codecs used today. H.264 is also known as MPEG-4 AVC (Part 10 of the MPEG-4 standard). H.264 is used on:
- Internet video (e.g. YouTube)
- HD video broadcasting
- Apple TV
- iPad, iPhone and more
- Video recording
Video Codecs used during the Video Production Workflow
There are mainly three steps during the production of video:
- Video Capture
- Video Editing
- Video rendering for players (e.g. Internet, streaming or Blu-ray)
Codecs used during Video Capture
We want to use a codec that captures the highest quality possible balanced with the data volume our cameras can handle. Here are some codecs used in HDSLRs and pro video cameras.
Simple Codec that is not very efficient in terms of a good balance of size and quality (non GOP Codec).
MPEG-4 AVC/ H.264
H.264 uses a GOP encoding method. This is not really a top recording format; the proper domains for H.264 are output and player devices. That said, H.264 is quite a usable compromise for getting reasonable quality in relatively small files. Canon HDSLRs use this codec in a .MOV container.
AVCHD is also a GOP based Codec. It is used by the Panasonic GH1/2, GF1, AF100 and TM700, and also by the Sony NEX-5 and VG10. It is very important to check the bit rates used. The GH1 uses 17 Mbit/s by default, and this leads to artifacts called 'Mud' (severe loss of detail in low contrast areas). Going up to 24 Mbit/s (used in the VG10, GH2 and AF100) already improves the quality.
Actually AVCHD (a standard by Sony/Panasonic, also used on some consumer video cameras by Canon) at its core uses the same H.264 Codec. The main difference is the packaging (container) of the data.
|AVCHD Structure on card/disk|
AVCHD is a format created for consumer camcorders. The structure actually mimics the basic structure of Blu-ray. Some TV sets and Blu-ray players can play these clips directly from the SD-Cards. For editing, only the .MTS files in the STREAM folder are needed. They actually contain an H.264 video stream (some tools like ClipWrap on Macs can re-wrap these streams to .MOV without changing any data).
On Windows the MTS files are supported natively with the Windows Media Player. On Macs Quicktime does not render MTS videos. Best to use a program like VLC (free) or Toast 10 Titanium (commercial product) to play MTS clips on Macs.
Professional Recording Codecs
MPEG-2 is still widely used (e.g. DVDs). Its compression is less efficient than MPEG-4 but also less complex. MPEG-2 is often used in professional video cameras (e.g. the Sony EX1 and some pro Canon camcorders).
Codec by Panasonic that only stores I-Frames (this means no GOP sequences).
- HDCAM HD
Professional Sony Codec family.
- Raw Data recording
The Red camera can record Raw data. This has many advantages in terms of image quality, but you also have to deal with massive amounts of data once the video clips are converted from Raw to standard Codecs. The actual data then get processed on the workstation (somewhat like using Raw converters for stills).
- Apple ProRes HQ
This Codec is often used in editing (with FinalCut Pro) but also can be native for the new high-end Arri ALEXA digital film camera (expensive new digital movie camera).
Codecs used during Video Editing
During the editing process you want to preserve the data quality. Any quality lost during recording cannot be rescued but the data you got should not be degraded even more. GOP compression Codecs are not ideal for video editing although some editors seem to work fine with them. It is much better to use non GOP Codecs like the Apple ProRes Codec family for editing. We always transcode (read below) our clips from GOP Codecs to high quality non-GOP versions like Apple ProRes. This results in a much better handling of the video during editing but also adds an extra step in our workflow.
Transcoding is the process of converting a video stream from one Codec to a different one, or simply downsampling the video. Transcoding to a lower quality (more highly compressed) Codec should only be done as the last step, to target your output device (Internet or Blu-ray). Editing in 720p is often easier on your computer system than using full 1080p. Even if we edit at 720p we still always shoot full 1080p footage because this shows fewer artifacts (like moire and aliasing) and also allows us to use full 1080p if needed.
If we record using H.264 or AVCHD in our cameras we transcode the footage to 720p Apple ProRes. This way we have a much smoother editing process than when using H.264 directly. We only convert the part of our footage that we intend to use for editing because the file sizes get quite a bit bigger.
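One reason 720p is easier on the editing system is simply the pixel count per frame. The sketch assumes the usual frame sizes of 1920 x 1080 and 1280 x 720, which are not spelled out above.

```python
def pixel_count(width: int, height: int) -> int:
    """Pixels in a single frame."""
    return width * height


full_hd = pixel_count(1920, 1080)  # assumed 1080p frame size
hd_720 = pixel_count(1280, 720)    # assumed 720p frame size

ratio = full_hd / hd_720
print(f"1080p: {full_hd} pixels, 720p: {hd_720} pixels, ratio {ratio}")
```

A 1080p frame carries 2.25 times as many pixels as a 720p frame, so every effect, preview and render has that much more data to process per frame.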
Tools we use for transcoding:
- MPEG Streamclip: Free video converter for Mac and Windows
- Quicktime Pro 7 (QT 7): Can be bought for Mac and Windows at nominal cost. Note: Quicktime X, which comes with Mac OS X 10.6.x, is less capable and does not work for transcoding.
- QTAmateur: Free Mac and Windows tool to enable batch transcoding with QT 7
- ClipWrap: To convert MTS files to .MOV on Macs
Video Codecs for output devices
For the final rendering of your video for your target devices the H.264 compression Codec often will be used for both the Internet and Blu-ray.
- Your Blu-ray authoring tools will take care of the appropriate compression and Codec.
- Often for the Internet higher compression and some downsampling is applied.
In video compression there is no free lunch and you have to pick your poison. Overall the solutions available work very well and not everybody works in Hollywood. We follow the following workflow in our personal work:
- Import the video clips from the camera. Most cameras don't give you much choice in how they record, but always use the maximum quality possible if you care about video quality.
- Transcode selected clips to a high quality Codec for editing
- Edit your video using this high quality Codec.
- Export your final video to a Codec that is optimal for your output device (mostly Internet, home video or Blu-ray)
This is an edited and updated version of an article first published on Digital Outback Photo. You can read the original article here, or check out the 220 page ebook (currently on offer at $19.95) 'Mastering HD Video with your DSLR' by Uwe Steinmueller & Helmut Kraus here.
© 2010, www.dpreview.com & Uwe Steinmueller.