1

I have about 30 gigabytes of video (mostly MP4, some MKV and webm) I need to transcode to 8-bit VP9 with Free Lossless Audio Codec (FLAC) audio in an MKV container from various input codecs (AAC audio; H264, VP8, H265/HEVC, and probably some other video codecs). On my most powerful system, transcoding low-resolution videos takes twice as long as the length of the video. I use ffmpeg on Linux with the arguments ffmpeg -i input -c:v libvpx-vp9 -lossless 1 -c:a FLAC -preset veryslow output.mkv to transcode videos without hardware assistance. Recently, however, a friend of mine got an Intel i5 Kaby Lake CPU for his PC, and has offered to transcode the videos for me. According to Wikipedia and its references the new Kaby Lake CPUs support hardware decoding of all my input codecs and encoding of 8-bit VP9. So I have two questions:

  1. What ffmpeg arguments can my friend use to transcode the videos to VP9 and audio to FLAC in an MKV container? Do they work with Windows? If not, that is fine as he has a Windows 10-Linux dual-boot.

  2. Is the veryslow preset still necessary to get best compression?

I've tried to find the answer to this question elsewhere but could only find examples for encoding codecs like H264 and JPEG.

2 Answers

3

UPDATE ON 3 AUGUST 2017: According to a newer answer by user 林正浩, ffmpeg now has support for VP9 encoding through VAAPI. I still don't have the hardware required to test this though so my answer will be of limited help. I'll leave my original answer on how to encode VP9 in software below.


For some reason FFmpeg doesn't support VP9 encoding on Intel's QuickSync hardware encoder, even though it supports H.264 and HEVC there. A search through the FFmpeg source code repository shows it's not even a matter of the feature being disabled; it just hasn't been implemented yet. But if it does become available at some point in the future, it should be usable in a manner similar to the other QuickSync encoders: a switch like -c:v vp9_qsv instead of -c:v libvpx-vp9 should do the job.

FFmpeg command line usage is the same on all platforms, with the one notable exception I know of being Windows users having to use NUL instead of /dev/null for output during the first pass of a 2-pass encode. But since you're doing 1-pass and lossless this shouldn't affect you.

If you want to speed up your encodes, the most obvious thing to try is setting an encoding speed value with the -speed switch. Recommended values are numbers from 0 to 4, with 0 being really, really slow (think -preset placebo in x264 but worse) but high quality, and 4 being fast but lower quality. ffmpeg uses -speed 1 by default, which is a good speed-for-quality tradeoff for lossy encoding. However, I just did a quick lossless encoding test with different speed values and noticed a 32% reduction in file size when going from -speed 1 to -speed 0. The encoding time tripled, so whether using 0 is worth it is up to you. The file produced by -speed 4, on the other hand, was only 1.1% larger than the one produced by -speed 1, and it was encoded 43% faster. So I'd say that if you're doing lossless and -speed 0 is too slow, you might as well use -speed 4.
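A rough sketch of how such a speed comparison could be repeated on your own footage. The file name sample.mkv, the speedN.mkv output names, and the pct_change helper are all placeholders invented for this example; the loop only runs if ffmpeg and the sample clip are present.

```shell
#!/bin/sh
# Percentage change from a baseline size to a new size (both in bytes).
# pct_change is a hypothetical helper, not part of ffmpeg.
pct_change() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", (b - a) * 100 / a }'
}

sample=sample.mkv   # assumption: a short clip representative of your videos
base=

if command -v ffmpeg >/dev/null 2>&1 && [ -f "$sample" ]; then
    # Encode at -speed 1 first so it serves as the size baseline.
    for speed in 1 0 4; do
        out="speed$speed.mkv"
        start=$(date +%s)
        ffmpeg -loglevel error -y -i "$sample" \
            -c:v libvpx-vp9 -lossless 1 -speed "$speed" \
            -c:a flac "$out"
        end=$(date +%s)
        size=$(( $(wc -c < "$out") ))
        : "${base:=$size}"   # remember the first (speed 1) size
        echo "speed=$speed seconds=$(( end - start )) size_vs_speed1=$(pct_change "$base" "$size")%"
    done
fi
```

Use a clip that is long enough for the timings to be meaningful; date +%s only has one-second resolution.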

Another important encoding performance increase comes from turning on multi-threading with the -threads switch; libvpx doesn't automatically use multiple threads, so this must be set manually by the user. You should also set the number of tile columns with the -tile-columns switch. This option makes libvpx divide the video into multiple tiles and encode these tiles in parallel for better multi-threading. You can find recommended numbers of tile columns and threads in the "Tiling and Threading Recommendations" section of Google's VP9 encoding guide. As you can see there, the number of threads used goes up with the number of tiles, which means that depending on the number of CPU cores available, your processor might not be fully saturated while encoding sub-HD-resolution video. If you mainly encode low-resolution videos you might want to consider encoding multiple files at the same time.
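To make the link between resolution and tiling concrete, here is a small sketch. It relies on two facts about VP9: -tile-columns takes the base-2 logarithm of the tile column count, and a tile column must be at least 256 pixels wide. The max_tile_columns helper name and the "two threads per tile column" rule are assumptions made for this example, so check the numbers against Google's guide before relying on them. The script only prints the resulting command.

```shell
#!/bin/sh
# Largest usable -tile-columns value (log2 of the tile column count) for a
# given frame width, assuming VP9's 256-pixel minimum tile width.
# max_tile_columns is a hypothetical helper, not an ffmpeg feature.
max_tile_columns() {
    w=$1
    n=0
    # Going from n to n+1 doubles the tile count to 2^(n+1); stop once the
    # resulting tiles would be narrower than 256 pixels (or at the cap of 6).
    while [ $(( w / (2 << n) )) -ge 256 ] && [ "$n" -lt 6 ]; do
        n=$(( n + 1 ))
    done
    echo "$n"
}

width=1280                        # example frame width
tc=$(max_tile_columns "$width")
threads=$(( 2 * (1 << tc) ))      # assumption: two threads per tile column

# Print the command rather than running it.
echo ffmpeg -i input.mp4 -c:v libvpx-vp9 -lossless 1 \
    -tile-columns "$tc" -threads "$threads" -c:a flac output.mkv
```

For a 320-pixel-wide video this yields -tile-columns 0, which illustrates why low-resolution encodes cannot keep many cores busy on their own.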

However, there is yet another way to speed up VP9 encoding: multi-threading within a single column tile, which can be turned on with -row-mt 1. As of April 4 (2017, hello future people), it isn't part of a released version of libvpx but will most likely be in libvpx 1.6.2. If you want to try it out before the next release you need to compile recent git versions of libvpx and ffmpeg from source. Just follow FFmpeg's compilation guide for your distro of choice, but instead of downloading and extracting a release tarball, run git clone https://chromium.googlesource.com/webm/libvpx.
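One way to find out whether a given ffmpeg build exposes row-based multi-threading is to look for the option in the encoder's help text. The sketch below assumes the option is listed as -row-mt there; has_row_mt is a helper name made up for this example.

```shell
#!/bin/sh
# Reads `ffmpeg -h encoder=libvpx-vp9` output on stdin and succeeds if a
# -row-mt option is listed. has_row_mt is a hypothetical helper.
has_row_mt() {
    grep -q -e '-row-mt'
}

if command -v ffmpeg >/dev/null 2>&1 &&
   ffmpeg -hide_banner -h encoder=libvpx-vp9 2>/dev/null | has_row_mt; then
    echo "row-mt available: add -row-mt 1 to the libvpx-vp9 options"
else
    echo "row-mt not listed by this ffmpeg build (or ffmpeg is not installed)"
fi
```

If the option is available, it is added next to the other encoder switches, for example -c:v libvpx-vp9 -lossless 1 -row-mt 1.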

As for the veryslow preset, that's only used in x264 and x265. libvpx uses the -speed switch and additionally the -quality best, -quality good, or -quality realtime options to define how much time the encoder is allowed to spend encoding a frame. The default is -quality good because -quality best is so slow it's unusable and -quality realtime is meant to be used for time-critical applications like video calls and livestreaming.

6
  • Great writeup, and welcome to Super User!
    – slhck
    Commented Apr 4, 2017 at 8:08
  • Thanks for the answer. What FFmpeg arguments can I use to encode VP9 with VA-API? I saw that the FFmpeg source has a file named 'vaapi_vp9.c'. I will try what you suggested above to speed up the process.
    – Billy
    Commented Apr 5, 2017 at 21:16
  • VA-API can be used to provide both hardware accelerated encoding and decoding, and in this case it's only decoding unfortunately. A less known fact about FFmpeg is that it actually has video playback functionality as well; Google Chrome and the Linux version of Firefox use it to play video and audio. The binary FFmpeg provides for playback is ffplay. (Also, I corrected a mistake in my comment: the quality switches are actually -quality good, -quality realtime etc, not just -good or -realtime.)
    – veikk0
    Commented Apr 5, 2017 at 23:04
  • Actually, I take that back. There are news articles that indicate VP9 encode support being present in VA-API but I can't find anything definite and I can't be asked to dig around source code right now. You should probably take a look here and here. I'd start with checking if FFmpeg has been built with --enable-vaapi and if yes, running ffmpeg -decoders | grep vaapi and ffmpeg -h encoder=<encodername>. I have an AMD CPU so I can't test this myself.
    – veikk0
    Commented Apr 5, 2017 at 23:49
  • 1
    @user13178 Please see the below answer – it is now possible to build ffmpeg with hardware-based VP9 encoding support. You may want to edit your post.
    – slhck
    Commented Jun 30, 2017 at 16:04
5

As of today, it is possible to build FFmpeg with VAAPI, which, on supported systems, allows you to encode VP9 on the Intel Integrated GPU.

The new encoder, when ffmpeg is compiled with VAAPI support, is called vp9_vaapi.

To see available options to use when tuning the encoder, run:

ffmpeg -hide_banner -h encoder=vp9_vaapi

Output:

Encoder vp9_vaapi [VP9 (VAAPI)]:
    General capabilities: delay 
    Threading capabilities: none
    Supported pixel formats: vaapi_vld
vp9_vaapi AVOptions:
  -loop_filter_level <int>        E..V.... Loop filter level (from 0 to 63) (default 16)
  -loop_filter_sharpness <int>        E..V.... Loop filter sharpness (from 0 to 15) (default 4)
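
A minimal sketch of an encode command using this encoder, assuming a Linux system where the GPU's render node is /dev/dri/renderD128 (this path is an assumption and may differ on your machine). The 5M bitrate is an arbitrary placeholder, and the two loop filter values are simply the defaults listed above. The format=nv12,hwupload filter chain uploads software-decoded frames to the GPU, since the encoder only accepts VAAPI surfaces. The run wrapper is a hypothetical helper that prints the command unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Dry run by default: print the command. Set DRY_RUN=0 to execute it.
DRY_RUN=${DRY_RUN:-1}
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Assumptions: render node path, input/output names, and bitrate.
run ffmpeg -vaapi_device /dev/dri/renderD128 -i input.mp4 \
    -vf 'format=nv12,hwupload' \
    -c:v vp9_vaapi -b:v 5M \
    -loop_filter_level 16 -loop_filter_sharpness 4 \
    -c:a flac output.mkv
```

This is a lossy, bitrate-targeted encode; nothing in the option list above suggests a lossless mode for vp9_vaapi.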

What happens when you try to pull this off on unsupported hardware, say Skylake?

See the sample output below:

[Parsed_format_0 @ 0x42cb500] compat: called with args=[nv12]
[Parsed_format_0 @ 0x42cb500] Setting 'pix_fmts' to value 'nv12'
[Parsed_scale_vaapi_2 @ 0x42cc300] Setting 'w' to value '1920'
[Parsed_scale_vaapi_2 @ 0x42cc300] Setting 'h' to value '1080'
[graph 0 input from stream 0:0 @ 0x42cce00] Setting 'video_size' to value '3840x2026'
[graph 0 input from stream 0:0 @ 0x42cce00] Setting 'pix_fmt' to value '0'
[graph 0 input from stream 0:0 @ 0x42cce00] Setting 'time_base' to value '1/1000'
[graph 0 input from stream 0:0 @ 0x42cce00] Setting 'pixel_aspect' to value '1/1'
[graph 0 input from stream 0:0 @ 0x42cce00] Setting 'sws_param' to value 'flags=2'
[graph 0 input from stream 0:0 @ 0x42cce00] Setting 'frame_rate' to value '24000/1001'
[graph 0 input from stream 0:0 @ 0x42cce00] w:3840 h:2026 pixfmt:yuv420p tb:1/1000 fr:24000/1001 sar:1/1 sws_param:flags=2
[format @ 0x42cba40] compat: called with args=[vaapi_vld]
[format @ 0x42cba40] Setting 'pix_fmts' to value 'vaapi_vld'
[auto_scaler_0 @ 0x42cd580] Setting 'flags' to value 'bicubic'
[auto_scaler_0 @ 0x42cd580] w:iw h:ih flags:'bicubic' interl:0
[Parsed_format_0 @ 0x42cb500] auto-inserting filter 'auto_scaler_0' between the filter 'graph 0 input from stream 0:0' and the filter 'Parsed_format_0'
[AVFilterGraph @ 0x42ca360] query_formats: 6 queried, 4 merged, 1 already done, 0 delayed
[auto_scaler_0 @ 0x42cd580] w:3840 h:2026 fmt:yuv420p sar:1/1 -> w:3840 h:2026 fmt:nv12 sar:1/1 flags:0x4
[hwupload @ 0x42cbcc0] Surface format is nv12.
[AVHWFramesContext @ 0x42ccbc0] Created surface 0x4000000.
[AVHWFramesContext @ 0x42ccbc0] Direct mapping possible.
[AVHWFramesContext @ 0x42c3e40] Created surface 0x4000001.
[AVHWFramesContext @ 0x42c3e40] Direct mapping possible.
[AVHWFramesContext @ 0x42c3e40] Created surface 0x4000002.
[AVHWFramesContext @ 0x42c3e40] Created surface 0x4000003.
[AVHWFramesContext @ 0x42c3e40] Created surface 0x4000004.
[AVHWFramesContext @ 0x42c3e40] Created surface 0x4000005.
[AVHWFramesContext @ 0x42c3e40] Created surface 0x4000006.
[AVHWFramesContext @ 0x42c3e40] Created surface 0x4000007.
[AVHWFramesContext @ 0x42c3e40] Created surface 0x4000008.
[AVHWFramesContext @ 0x42c3e40] Created surface 0x4000009.
[AVHWFramesContext @ 0x42c3e40] Created surface 0x400000a.
[vp9_vaapi @ 0x409da40] Encoding entrypoint not found (19 / 6).
Error initializing output stream 0:0 -- Error while opening encoder for output stream #0:0 - maybe incorrect parameters such as bit_rate, rate, width or height
[AVIOContext @ 0x40fdac0] Statistics: 0 seeks, 0 writeouts
[aac @ 0x40fcb00] Qavg: -nan
[AVIOContext @ 0x409f820] Statistics: 32768 bytes read, 0 seeks
Conversion failed!

The interesting bit is the "Encoding entrypoint not found" error: the VP9 encoding entrypoint is absent on this particular platform, as confirmed by vainfo's output:

libva info: VA-API version 0.40.0
libva info: va_getDriverName() returns 0
libva info: Trying to open /usr/local/lib/dri/i965_drv_video.so
libva info: Found init function __vaDriverInit_0_40
libva info: va_openDriver() returns 0
vainfo: VA-API version: 0.40 (libva 1.7.3)
vainfo: Driver version: Intel i965 driver for Intel(R) Skylake - 1.8.4.pre1 (glk-alpha-71-gc3110dc)
vainfo: Supported profile and entrypoints
      VAProfileMPEG2Simple            : VAEntrypointVLD
      VAProfileMPEG2Simple            : VAEntrypointEncSlice
      VAProfileMPEG2Main              : VAEntrypointVLD
      VAProfileMPEG2Main              : VAEntrypointEncSlice
      VAProfileH264ConstrainedBaseline: VAEntrypointVLD
      VAProfileH264ConstrainedBaseline: VAEntrypointEncSlice
      VAProfileH264ConstrainedBaseline: VAEntrypointEncSliceLP
      VAProfileH264Main               : VAEntrypointVLD
      VAProfileH264Main               : VAEntrypointEncSlice
      VAProfileH264Main               : VAEntrypointEncSliceLP
      VAProfileH264High               : VAEntrypointVLD
      VAProfileH264High               : VAEntrypointEncSlice
      VAProfileH264High               : VAEntrypointEncSliceLP
      VAProfileH264MultiviewHigh      : VAEntrypointVLD
      VAProfileH264MultiviewHigh      : VAEntrypointEncSlice
      VAProfileH264StereoHigh         : VAEntrypointVLD
      VAProfileH264StereoHigh         : VAEntrypointEncSlice
      VAProfileVC1Simple              : VAEntrypointVLD
      VAProfileVC1Main                : VAEntrypointVLD
      VAProfileVC1Advanced            : VAEntrypointVLD
      VAProfileNone                   : VAEntrypointVideoProc
      VAProfileJPEGBaseline           : VAEntrypointVLD
      VAProfileJPEGBaseline           : VAEntrypointEncPicture
      VAProfileVP8Version0_3          : VAEntrypointVLD
      VAProfileVP8Version0_3          : VAEntrypointEncSlice
      VAProfileHEVCMain               : VAEntrypointVLD
      VAProfileHEVCMain               : VAEntrypointEncSlice
      VAProfileVP9Profile0            : VAEntrypointVLD

The VLD (Variable Length Decode) entry point for VP9 profile 0 is as far as Skylake goes in terms of VP9 hardware acceleration: decode only, no encode.
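
The same check can be scripted. This sketch assumes that an encode-capable driver reports a VAProfileVP9 line with an entrypoint whose name starts with VAEntrypointEnc, following the naming seen in the listing above; has_vp9_encode is a helper name invented for this example.

```shell
#!/bin/sh
# Reads vainfo output on stdin; succeeds if any VP9 profile is paired with
# an encode entrypoint (VAEntrypointEncSlice, VAEntrypointEncSliceLP, ...).
# has_vp9_encode is a hypothetical helper.
has_vp9_encode() {
    grep 'VAProfileVP9' | grep -q 'VAEntrypointEnc'
}

if command -v vainfo >/dev/null 2>&1; then
    if vainfo 2>/dev/null | has_vp9_encode; then
        echo "VP9 encode entrypoint found"
    else
        echo "no VP9 encode entrypoint (decode only, or no VP9 support)"
    fi
else
    echo "vainfo is not installed"
fi
```

Fed the Skylake listing above, the check fails, because the only VP9 line there is paired with VAEntrypointVLD.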

Those of you with Kaby Lake test beds: run these encode tests and report back :-)
