This document provides a description of VP9 levels, and a decoder performance test suite that targets the 420 format. It defines 14 levels for Profile 0 (8-bit) and Profile 2 (10-bit). Each level is a set of constrained bitstreams coded with targeted resolutions, frame rates, and bitrates.
The levels are defined in terms of the following metrics, which correspond to the columns of the table below:
Max Luma Sample Rate: The maximum number of luma samples per second. Alt-ref frames are counted, so this value is larger than the displayed luma samples per second.
Max Luma Picture Size: The extended frame size in pixels (samples). The decoder needs to handle pixels extended beyond the right and bottom edges of the frame by up to 32 pixels during the inverse transform process.
Max Bitrate: The highest average bitrate over the video sequence.
Max CPB Size: The largest data size for any 4 consecutive frames, including alt-ref frames.
Min Compression Ratio: The smallest allowable compression ratio, mainly to prevent encoder misbehavior.
Max Tiles: The maximum number of column tiles allowed per frame. Note that the minimum column tile width is 256 pixels (samples) and the maximum width is 4096 pixels (samples).
Min Alt-Ref Distance: The minimum distance, in frames, between two consecutive alternate reference frames.
Max Reference Frames: The maximum number of reference frame buffers that can be used.
Max Width and Height for Luma Picture: The maximum value (in samples) of the luma picture width and height.
Example Frame Size @ Display Rate: Typical examples of frame size and display frame rate.
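The "largest data size for any 4 consecutive frames" metric can be illustrated with a sliding window over per-frame sizes. The following is a minimal sketch, not a reference implementation: the function name `max_window_bits` and the sample frame sizes are hypothetical, and it assumes the sizes are listed in decode order with alt-ref frames included.

```python
from typing import Sequence


def max_window_bits(frame_bits: Sequence[int], window: int = 4) -> int:
    """Return the largest total size, in bits, of any `window` consecutive frames."""
    if window <= 0:
        raise ValueError("window must be positive")
    if len(frame_bits) <= window:
        # Fewer frames than the window: the whole sequence is the only window.
        return sum(frame_bits)
    current = sum(frame_bits[:window])
    best = current
    for i in range(window, len(frame_bits)):
        # Slide the window one frame forward.
        current += frame_bits[i] - frame_bits[i - window]
        best = max(best, current)
    return best


# Hypothetical per-frame sizes in bits (decode order, alt-ref frames included).
sizes = [120_000, 20_000, 15_000, 18_000, 90_000, 22_000]
print(max_window_bits(sizes))  # 173000
```

The result, converted to units of 1000 bits, would be compared against the Max CPB Size column for the target level.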
Level | Max Luma Sample Rate* (samples/sec) | Max Luma Picture Size (samples) | Max Bitrate (x 1000 bits/s) | Max CPB Size for Visual Layer** (1000 bits) | Min Compression Ratio | Max Tiles | Min Alt-Ref Distance | Max Reference Frames | Max Width and Height for Luma Picture (Samples) | Example Frame Size @ Display Rate (fps)*** |
---|---|---|---|---|---|---|---|---|---|---|
0 | Undefined | |||||||||
1 | 829440 | 36864 | 200 | 400 | 2 | 1 | 4 | 8 | 512 | 256×144@15 |
1.1 | 2764800 | 73728 | 800 | 1000 | 2 | 1 | 4 | 8 | 768 | 384×192@30 |
2 | 4608000 | 122880 | 1800 | 1500 | 2 | 1 | 4 | 8 | 960 | 480×256@30 |
2.1 | 9216000 | 245760 | 3600 | 2800 | 2 | 2 | 4 | 8 | 1344 | 640×384@30 |
3 | 20736000 | 552960 | 7200 | 6000 | 2 | 4 | 4 | 8 | 2048 | 1080×512@30 |
3.1 | 36864000 | 983040 | 12000 | 10000 | 2 | 4 | 4 | 8 | 2752 | 1280×768@30 |
4 | 83558400 | 2228224 | 18000 | 16000 | 4 | 4 | 4 | 8 | 4160 | 2048×1088@30 |
4.1 | 160432128 | 2228224 | 30000 | 18000 | 4 | 4 | 5 | 6 | 4160 | 2048×1088@60 |
5 | 311951360 | 8912896 | 60000 | 36000 | 6 | 8 | 6 | 4 | 8384 | 4096×2176@30 |
5.1 | 588251136 | 8912896 | 120000 | 46000 | 8 | 8 | 10 | 4 | 8384 | 4096×2176@60 |
5.2 | 1176502272 | 8912896 | 180000 | TBD | 8 | 8 | 10 | 4 | 8384 | 4096×2176@120 |
6 | 1176502272 | 35651584 | 180000 | TBD | 8 | 16 | 10 | 4 | 16832 | 8192×4352@30 |
6.1 | 2353004544 | 35651584 | 240000 | TBD | 8 | 16 | 10 | 4 | 16832 | 8192×4352@60 |
6.2 | 4706009088 | 35651584 | 480000 | TBD | 8 | 16 | 10 | 4 | 16832 | 8192×4352@120 |
* Max Luma Sample Rate is computed as the average luma samples processed
over an alternate reference frame group (ARFG). An ARFG is defined as a group
of frames that starts with an ARF and ends just before the next ARF. If the
sequence contains no ARF, the luma sample rate can be computed over any
one (1) second window.
** Max CPB Size reflects the maximum data size over 4 frames.
*** Examples only.
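As an illustration of how the table can be used, the sketch below looks up the lowest level whose limits cover a given stream. It copies only four columns from the table (luma sample rate, luma picture size, bitrate, and maximum dimension) and ignores the rest, so it is a partial check at best. The function name `min_level` is hypothetical, and treating the luma picture size as plain width × height is an assumption.

```python
from typing import Optional

# (level, max luma samples/sec, max luma picture size, max bitrate in kbit/s, max width/height)
LEVELS = [
    ("1", 829440, 36864, 200, 512),
    ("1.1", 2764800, 73728, 800, 768),
    ("2", 4608000, 122880, 1800, 960),
    ("2.1", 9216000, 245760, 3600, 1344),
    ("3", 20736000, 552960, 7200, 2048),
    ("3.1", 36864000, 983040, 12000, 2752),
    ("4", 83558400, 2228224, 18000, 4160),
    ("4.1", 160432128, 2228224, 30000, 4160),
    ("5", 311951360, 8912896, 60000, 8384),
    ("5.1", 588251136, 8912896, 120000, 8384),
    ("5.2", 1176502272, 8912896, 180000, 8384),
    ("6", 1176502272, 35651584, 180000, 16832),
    ("6.1", 2353004544, 35651584, 240000, 16832),
    ("6.2", 4706009088, 35651584, 480000, 16832),
]


def min_level(width: int, height: int, luma_rate: int, bitrate_kbps: int) -> Optional[str]:
    """Return the lowest level covering the stream, or None if no level does.

    `luma_rate` is the luma sample rate in samples/sec, which the caller must
    compute with alt-ref frames included (see the first footnote).
    """
    picture_size = width * height  # assumption: no extra alignment applied
    for name, max_rate, max_size, max_kbps, max_dim in LEVELS:
        if (luma_rate <= max_rate and picture_size <= max_size
                and bitrate_kbps <= max_kbps and max(width, height) <= max_dim):
            return name
    return None


# Display-rate approximations only; real streams with alt-ref frames have a higher rate.
print(min_level(256, 144, 256 * 144 * 15, 150))     # 1
print(min_level(1280, 720, 1280 * 720 * 30, 5000))  # 3.1
```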
A decoder test suite targeting the 420 format.
There are 6 different tests to cover various aspects of decoder behavior.
Test the situation where frame size is (32*n + 8*m) in both width and height dimensions. It exercises the decoder behavior near frame boundaries.
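To make the (32*n + 8*m) form concrete, the sketch below splits a frame dimension into its n and m parts. It assumes m ranges over 0 to 3, so that 8*m is the remainder past the last multiple of 32. The source does not state this range, and the helper name `split_dimension` is hypothetical.

```python
from typing import Tuple


def split_dimension(size: int) -> Tuple[int, int]:
    """Return (n, m) such that size == 32*n + 8*m with 0 <= m <= 3."""
    if size <= 0 or size % 8 != 0:
        raise ValueError("size must be a positive multiple of 8")
    n, remainder = divmod(size, 32)
    return n, remainder // 8


print(split_dimension(256))  # (8, 0)
print(split_dimension(144))  # (4, 2)
```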
Stress test on decoder's ability to handle sub8x8 coded blocks. It biases towards sub8x8 modes.
Exercise the decoder's ability to use up to the maximum number (N) of reference frame buffers allowed for each level. The test first fills all N reference frame buffers. It then selects a point where the reference frame update is temporarily suspended, so that the reference frame buffer is effectively frozen. The motion-compensated prediction reference frames for each subsequently coded frame are picked from this static reference frame buffer set so that all N frames are covered. This pattern repeats every 64 frames.
Enforce minimum golden reference frame distance as a stress test for decoding speed.
Exercise the internal reference frame resizing ability. The codec switches between original frame size and a down-scaled size twice per second.
Stress test on decoder's ability to handle sub8x8 coded blocks with scaled reference frames. It biases towards the sub8x8 modes. The frame size switches between original frame size and a down-scaled size twice every second.
Test clips are encoded with a keyframe inserted every 2 seconds. The type of test and test parameter can be inferred from the name of the bitstream files (*.webm). For example, "crowd_run_256X144_fr15_bd8_gf_dist_4_l1.webm" is a bitstream that is generated by encoding the "crowd_run" clip using minimum golden reference frame distance for level 1, which is 4.
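The naming pattern in the example above can be parsed with a regular expression. The sketch below is fitted to that single example file name only: the field names are hypothetical, the test type and its parameter are kept together as one raw token, and file names for other tests or for fractional levels may not match it.

```python
import re
from typing import Dict, Optional

# Pattern inferred from one example name; treat it as an assumption, not a spec.
NAME_RE = re.compile(
    r"^(?P<clip>.+)_(?P<width>\d+)[Xx](?P<height>\d+)"
    r"_fr(?P<fps>\d+)_bd(?P<bit_depth>\d+)"
    r"_(?P<test>.+)_l(?P<level>\d+)\.webm$"
)


def parse_name(filename: str) -> Optional[Dict[str, str]]:
    """Split a test bitstream file name into its fields, or return None."""
    match = NAME_RE.match(filename)
    return match.groupdict() if match else None


info = parse_name("crowd_run_256X144_fr15_bd8_gf_dist_4_l1.webm")
print(info)
```

For the example name, `info["clip"]` is `"crowd_run"`, `info["test"]` is `"gf_dist_4"`, and `info["level"]` is `"1"`.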
Downloads last modified 2016-05-04
The decoder should be able to decode a given bitstream and generate an MD5
that matches the MD5 listed in the metadata. Note that --i420 --md5
should be passed to the decoder when generating the MD5.
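The MD5 check can be scripted along the following lines. This is a sketch under several assumptions: that the decoder is libvpx's `vpxdec` and is available on the PATH, that it prints the digest as the first whitespace-separated token on standard output, and that the expected MD5 has already been read from the metadata. The helper names are hypothetical.

```python
import subprocess
from typing import List


def md5_command(path: str, decoder: str = "vpxdec") -> List[str]:
    """Build the decoder command line using the flags named above."""
    return [decoder, "--i420", "--md5", path]


def first_token(output: str) -> str:
    """Return the first whitespace-separated token, lowercased (assumed to be the digest)."""
    parts = output.split()
    return parts[0].lower() if parts else ""


def decoded_md5(path: str, decoder: str = "vpxdec") -> str:
    """Run the decoder and return the MD5 it reports for the decoded frames."""
    result = subprocess.run(
        md5_command(path, decoder), check=True, capture_output=True, text=True
    )
    return first_token(result.stdout)


def md5_matches(path: str, expected_md5: str, decoder: str = "vpxdec") -> bool:
    """Compare the decoder's MD5 with the value listed in the metadata."""
    return decoded_md5(path, decoder) == expected_md5.strip().lower()
```

A call such as `md5_matches("crowd_run_256X144_fr15_bd8_gf_dist_4_l1.webm", expected)` would then report whether the decoder output agrees with the metadata.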
Also see VP9 Coding Profiles.