Passing CFLAGS and CXXFLAGS to a HandBrake build of the latest version (v1.3.3 at the time of writing) works until you add -flto, which fails the whole build.

How can I build HandBrake with LTO (-flto) and, as a stretch goal, with FDO as well (feedback-directed optimisation, also known as PGO)?

Most of the codecs within HandBrake are developed with "hand-coded" assembly, so many assert that the compiler optimisation gains would not be that much.
I would like to test and challenge that assertion!

2 Answers

EDIT 01/08/2021: Everything below was done against HandBrake v1.3.3. See my newer answer for HandBrake v1.4.0.

I answered a GitHub issue similar to the question I asked here, and thought the answer would serve people with similar problems better on Stack Exchange than buried in a GitHub issue ticket: https://github.com/HandBrake/HandBrake/issues/1072#issuecomment-865630524

The observed benefits should also reward those willing to go through the effort by saving them a lot of encode/transcode time. They can benchmark the result afterwards to test the assertion.

Much of the procedure was worked out by experimenting from the notes here: https://github.com/griff/HandBrake/blob/master/doc/BUILD-Linux

As described in the link above, using CFLAGS/CXXFLAGS is not recommended to steer the compilation or build. It is recommended to use the built-in configuration mechanism to set the gcc flags.

HOW?

HandBrake is just a front-end to a LOT of "contrib" modules. To see how each contrib module is built, you can use the "make" reports for each contrib in the build or destination directory, before making them.

To get a build directory, you would need to do an initial configuration via...

$  ./configure --build=build --optimize=speed

if you haven't got one.

MAKE REPORTS

e.g. Let's say you're building HandBrake in a folder called "build" (like the value in the configure command above), then:

$  cd ./build
$  make report.help
  AVAILABLE MAKEFILE VARS REPORTS
  ----------------------------------------------------------------
  report.main            global general vars
  report.gcc             global gcc vars (inherited by module GCC)
  report.var             usage: make report.var name=VARNAME
  x265.report            X265-scoped vars
  x265_8.report          X265_8-scoped vars
  x265_10.report         X265_10-scoped vars
  x265_12.report         X265_12-scoped vars
  libdav1d.report        LIBDAV1D-scoped vars
  ffmpeg.report          FFMPEG-scoped vars
  libdvdread.report      LIBDVDREAD-scoped vars
  libdvdnav.report       LIBDVDNAV-scoped vars
  libbluray.report       LIBBLURAY-scoped vars
  nvenc.report           NVENC-scoped vars
  libhb.report           LIBHB-scoped vars
  test.report            TEST-scoped vars
  gtk.report             GTK-scoped vars
  pkg.report             PKG-scoped vars

The first column of each line above is a report name. You can then view a report with

$  make <report_name>

Where you replace <report_name> with the report you want.

It is important to note, there's a hierarchy and inheritance to the above even within each report.

report.gcc

can be taken as the root for gcc flags.

In my case, I chose to configure the build using "speed" previously...

$  ./configure --build=build --optimize=speed

Which maps to

GCC.args.O.speed

in the report.gcc

Another important key in that report is

GCC.args.extra

which basically 'may' append extra compiler option flags after the former. As you know with gcc, if there's a conflict between options, the last one is used. Since we can't tell easily enough if the many modules are using one or the other or both, I tend to ensure whatever is in the first, is also in the latter. But the latter can contain more! You can see the defaults by checking the report.

You can override the above by creating a text file configuration called "custom.defs" in the root of the handbrake source folder (if you git cloned it, then the top folder of HandBrake where you basically do your git pull commands).

/HandBrake$ ls -h
AUTHORS.markdown  CODE_OF_CONDUCT.md  CONTRIBUTING.md  download  gtk      macosx         pkg              scripts      THANKS.markdown
build             configure           COPYING          gccFDO    libhb    make           preset           SECURITY.md  TRANSLATION.markdown
build2            contrib             custom.defs    graphics  LICENSE  NEWS.markdown  README.markdown  test         win

FDO (aka PGO)

I do FDO (feedback-directed optimisation aka FDO aka PGO - Profile Guided Optimisation) in mine so I usually build first with custom.defs defined as

$ cat custom.defs 
GCC.args.O.speed = -march=native -O3 -pipe -fprofile-generate=../gccFDO -fprofile-update=atomic
GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -fprofile-generate=../gccFDO -fprofile-update=atomic

Then run HandBrake, transcoding several videos with a variety of codecs, filters, and settings for a couple of days to generate profiles. Then I use the generated profiles by using...

$ cat custom.defs 
GCC.args.O.speed = -march=native -O3 -pipe -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training

on a brand new build directory. Good candidates for profiling are your typical source types transcoded to your typical target encode type. My typical target type is x265_10bit with AAC audio:

  1. From x264 to x265_10bit
  2. From x265 to x265_10bit
  3. From the various forms of AC3 to the typical AAC you use
  4. From various forms of DTS to the typical AAC you use
  5. Any typical pre-processing, filtering, denoising, etc that you use.

As you can imagine, depending on your hardware, this could take a while! My profiling took a week!
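The two custom.defs files above differ only in their profiling flags, so the second can be derived from the first. A minimal sketch, assuming the training file uses exactly the flags shown above; the file names custom.defs.train and custom.defs.use are hypothetical, and the demo runs in a temporary directory rather than a real source tree:

```shell
# Work in a scratch directory so the demo does not touch a real source tree.
workdir=$(mktemp -d)

# Training-stage definitions, copied from the listing above.
cat > "$workdir/custom.defs.train" <<'EOF'
GCC.args.O.speed = -march=native -O3 -pipe -fprofile-generate=../gccFDO -fprofile-update=atomic
GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -fprofile-generate=../gccFDO -fprofile-update=atomic
EOF

# Swap the "generate" flags for the "use" flags; everything else is kept as-is.
sed -e 's|-fprofile-generate=|-fprofile-use=|' \
    -e 's|-fprofile-update=atomic|-fprofile-correction -fprofile-partial-training|' \
    "$workdir/custom.defs.train" > "$workdir/custom.defs.use"

cat "$workdir/custom.defs.use"
```

Copy the result over custom.defs only after checking it, and remember to configure into a fresh build directory for the second stage.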

You can fine-tune the compiler flags and optimisation for each module by using the reporting process described above and overriding that module's keys in the custom.defs file with the values you want, just like the example above for the GCC.args.* defaults.

For all of the above to work, remember not to have CFLAGS or CXXFLAGS exported. You can check which flags are set up in your bash session with:

$  export -p | grep FLAGS
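If that command prints anything, the variables can be cleared for the current session before configuring. A small sketch; the export line at the top only simulates a polluted environment for the demo:

```shell
# Simulate flags left over from an earlier experiment (demo only).
export CFLAGS="-O2" CXXFLAGS="-O2"

# Remove them from this shell session so only custom.defs steers the build.
unset CFLAGS CXXFLAGS

# Report whether either variable is still exported.
export -p | grep -E ' C(XX)?FLAGS=' || echo "CFLAGS/CXXFLAGS are not exported"
```

This only affects the current shell; if the variables are set in a startup file such as ~/.bashrc, they will be back in the next session.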

LTO + FDO:

Link Time Optimisation (LTO) and FDO work very well together, as can easily be researched online for many programs and benchmarks.

Unfortunately, setting LTO as the default in GCC.args.* using -flto, or setting LTO for the FFMPEG module, fails the whole build. That's a boolean 'or': it will fail on one or the other or both!

LTO can be added however to all other modules!

This is my custom.defs...

$ cat custom.defs
GCC.args.O.speed = -march=native -O3 -pipe -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
X265.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
X265.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
X265_8.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
X265_8.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
X265_10.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
X265_10.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
X265_12.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
X265_12.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
LIBHB.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
LIBHB.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
LIBDAV1D.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
LIBDAV1D.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
GTK.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
GTK.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
LIBDVDREAD.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
LIBDVDREAD.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
LIBDVDNAV.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
LIBDVDNAV.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
LIBBLURAY.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
LIBBLURAY.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
TEST.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
TEST.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
NVENC.GCC.args.O.speed = -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
NVENC.GCC.args.extra = -mfpmath=sse -march=native -O3 -pipe -flto -fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training
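Since every module gets the same two lines, the file can be generated rather than typed by hand. A sketch under the assumption that all modules share one flag set, as in the listing above; the output name custom.defs.generated is hypothetical, so review the file before copying it over custom.defs:

```shell
# Flags shared by every module. -flto is added only to the per-module keys,
# because the global defaults and FFMPEG fail to build with it.
base="-march=native -O3 -pipe"
fdo="-fprofile-use=../gccFDO -fprofile-correction -fprofile-partial-training"
modules="X265 X265_8 X265_10 X265_12 LIBHB LIBDAV1D GTK LIBDVDREAD LIBDVDNAV LIBBLURAY TEST NVENC"

out=$(mktemp -d)/custom.defs.generated
{
    # Global defaults: no LTO.
    echo "GCC.args.O.speed = $base $fdo"
    echo "GCC.args.extra = -mfpmath=sse $base $fdo"
    # Per-module overrides: same flags plus LTO.
    for m in $modules; do
        echo "$m.GCC.args.O.speed = $base -flto $fdo"
        echo "$m.GCC.args.extra = -mfpmath=sse $base -flto $fdo"
    done
} > "$out"

echo "wrote $out"
```

Adding or dropping a module is then a one-word change to the modules list instead of two long lines.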

EDIT 01/08/2021: Everything above was done against HandBrake v1.3.3.

The above process failed for me with v1.4.0. Please see my other answer for v1.4.0.

    Note that some of your speedup may be coming from -march=native, as well as LTO + FDO. IvyBridge only has AVX1 (and no BMI1 / BMI2), but it does have popcnt. Baseline Ubuntu repo builds probably don't use any -m options, so CPU features beyond the SSE2 baseline for x86-64 only get used by runtime dispatching (e.g. inside x265), not across the board in C code, including any filters that can benefit from auto-vectorization with new instructions up to SSE4 (and their AVX1 3-operand versions). Also the -mtune=native implied by -march=native probably helps a bit. Commented Jul 5, 2021 at 3:54

I have retried against the latest tagged version, HandBrake v1.4.0, with both GCC-11 and CLANG-12. The configuration needed to change a bit to get successful builds. The GCC-11 build, for example, failed for certain modules because it couldn't resolve the path to the profile files after training (gcda files in an absolute path).

Below are the training and FDO configurations for both GCC-11 and CLANG-12 against v1.4.0. They differ from the previous answer's process, which was for HandBrake v1.3.3.

GCC-11:

Configure and Build command for GCC-11:

./configure --harden --optimize=speed --enable-fdk-aac --disable-nvenc --build=build-v1.4.0 && cd ./build-v1.4.0 && time make -j$(( $(nproc) + 1 ));

TRAINING/PROFILING STAGE HANDBRAKE V1.4.0 --> GCC-11 custom.defs file:

GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic
GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic
X265.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
X265.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
X265_8.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
X265_8.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
X265_10.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
X265_10.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
X265_12.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
X265_12.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBHB.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBHB.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBDAV1D.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBDAV1D.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
GTK.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
GTK.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBDVDREAD.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBDVDREAD.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBDVDNAV.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBDVDNAV.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBBLURAY.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBBLURAY.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
TEST.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
TEST.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
FDKAAC.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
FDKAAC.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
ZIMG.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
ZIMG.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBDAV1D.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto
LIBDAV1D.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-generate -fprofile-update=atomic -flto

FDO STAGE HANDBRAKE V1.4.0 --> GCC-11 custom.defs file:

GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training
GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training
X265.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
X265.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
X265_8.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
X265_8.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
X265_10.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
X265_10.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
X265_12.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
X265_12.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBHB.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBHB.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBDAV1D.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBDAV1D.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
GTK.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
GTK.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBDVDREAD.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBDVDREAD.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBDVDNAV.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBDVDNAV.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBBLURAY.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBBLURAY.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
TEST.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
TEST.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
FDKAAC.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
FDKAAC.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
ZIMG.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
ZIMG.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBDAV1D.GCC.args.O.speed = -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
LIBDAV1D.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -msse2avx -O3 -pipe -fprofile-use -fprofile-correction -fprofile-partial-training -flto
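With this many near-identical lines it is easy to define a key twice (the listings above contain the LIBDAV1D pair twice, which should be harmless here because both copies carry the same value). A small sketch for spotting repeats; the sample file is a stand-in created for the demo, so point the pipeline at your own custom.defs instead:

```shell
# Stand-in for a custom.defs with one repeated key (demo only).
defs=$(mktemp)
cat > "$defs" <<'EOF'
LIBDAV1D.GCC.args.O.speed = -O3 -flto
ZIMG.GCC.args.O.speed = -O3 -flto
LIBDAV1D.GCC.args.O.speed = -O3 -flto
EOF

# Keep the key (text before the first '='), strip spaces, print repeated keys.
dupes=$(cut -d= -f1 "$defs" | tr -d ' ' | sort | uniq -d)
echo "duplicate keys: ${dupes:-none}"
```

A repeated key matters most when the two copies differ, since it is then not obvious from reading the file which value the build ends up using.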

LLVM-12/CLANG-12/LLD-12:

Clang does PGO a little differently to GCC. Unlike GCC and its default tools, Clang/LLVM/LLD had no problems resolving absolute paths for modules. However, Clang needs an extra step to merge the raw profile files before they can be used for FDO.

Thus there are 3 steps:

  1. Training/Profile Stage
  2. Merge raw profile data
  3. FDO Stage

The commands for each step are given in detail here. The custom.defs files for steps 1 and 3 are listed separately after the three steps; this section only states the commands required per step, not the custom.defs. You will therefore need to ensure that the right custom.defs is in place before running the configure and build commands:

  1. Configure and build command for LLVM-12/CLANG-12/LLD-12:
LDFLAGS="-fuse-ld=lld" ./configure --ar /usr/bin/llvm-ar --ranlib /usr/bin/llvm-ranlib --strip /usr/bin/llvm-strip --cc /usr/bin/clang --optimize=speed --enable-fdk-aac --disable-nvenc --build=build-v1.4.0-CLANG && cd ./build-v1.4.0-CLANG && time LDFLAGS="-fuse-ld=lld" make -j$(( $(nproc) + 1 ));

After building, train/profile as normal, as with GCC or the earlier instructions for v1.3.3.

  2. After training/profiling, merge the raw profile data. Replace <Absolute-Path> and <Absolute-Path-To-Profile-Files> with the correct locations for your build.
llvm-profdata merge -output=<Absolute-Path>/handbrake.profdata <Absolute-Path-To-Profile-Files>/default_*.profraw
  3. FDO build. This is the exact same one-line command as step 1; the difference is in the custom.defs file.
LDFLAGS="-fuse-ld=lld" ./configure --ar /usr/bin/llvm-ar --ranlib /usr/bin/llvm-ranlib --strip /usr/bin/llvm-strip --cc /usr/bin/clang --optimize=speed --enable-fdk-aac --disable-nvenc --build=build-v1.4.0-CLANG && cd ./build-v1.4.0-CLANG && time LDFLAGS="-fuse-ld=lld" make -j$(( $(nproc) + 1 ));
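The merge step is the one most likely to be run against the wrong directory, so it can help to wrap it in a small guard. A sketch only: the function name merge_profiles is made up for this example, and it simply refuses to call llvm-profdata when no default_*.profraw files are present:

```shell
# merge_profiles DIR OUTPUT
# Merge DIR/default_*.profraw into OUTPUT, refusing to run on an empty set.
merge_profiles() {
    profile_dir=$1
    output=$2
    # Expand the glob into the function's positional parameters.
    set -- "$profile_dir"/default_*.profraw
    # If nothing matched, "$1" does not name an existing file.
    if [ ! -e "$1" ]; then
        echo "no default_*.profraw files in $profile_dir; was training run?" >&2
        return 1
    fi
    llvm-profdata merge -output="$output" "$@"
}

# Example call (both paths are placeholders for your own locations):
# merge_profiles /path/to/profiles /path/to/handbrake.profdata
```

The output path passed here must be the same one that the FDO-stage custom.defs points at with -fprofile-instr-use.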

TRAINING/PROFILING STAGE HANDBRAKE V1.4.0 --> LLVM-12/CLANG-12/LLD-12 custom.defs file:

Remember to replace <Absolute-Path-To-Profile-Files> with the correct absolute path.

GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic
GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic
X265.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
X265.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
X265_8.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
X265_8.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
X265_10.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
X265_10.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
X265_12.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
X265_12.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBHB.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBHB.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBDAV1D.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBDAV1D.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
GTK.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
GTK.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBDVDREAD.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBDVDREAD.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBDVDNAV.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBDVDNAV.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBBLURAY.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBBLURAY.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
TEST.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
TEST.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
FDKAAC.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
FDKAAC.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
ZIMG.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
ZIMG.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBDAV1D.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin
LIBDAV1D.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-generate=<Absolute-Path-To-Profile-Files>/default_%m.profraw -fprofile-update=atomic -flto=thin

FDO STAGE HANDBRAKE V1.4.0 --> LLVM-12/CLANG-12/LLD-12 custom.defs file:

Remember to replace <Absolute-Path-To-Merged-Profile> with the correct absolute path.

GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata
GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata
X265.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
X265.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
X265_8.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
X265_8.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
X265_10.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
X265_10.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
X265_12.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
X265_12.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBHB.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBHB.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBDAV1D.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBDAV1D.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
GTK.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
GTK.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBDVDREAD.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBDVDREAD.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBDVDNAV.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBDVDNAV.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBBLURAY.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBBLURAY.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
TEST.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
TEST.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
FDKAAC.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
FDKAAC.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
ZIMG.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
ZIMG.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBDAV1D.GCC.args.O.speed = -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin
LIBDAV1D.GCC.args.extra = -mfpmath=sse -mavx -fstack-protector-strong -D_FORTIFY_SOURCE=2 -march=native -O3 -pipe -fprofile-instr-use=<Absolute-Path-To-Merged-Profile>/handbrake.profdata -flto=thin

Well, there you go: you can now build HandBrake v1.4.0 with PGO+LTO using either GCC or LLVM/CLANG/LLD. Feel free to choose whichever of the two tickles your fancy, or benchmark to your heart's content! :-)
