
While tuning compiler optimizations, the question came up of what a good value is for the GNU gcc parameter --param l2-cache-size=? (the value is a number in kB).
What values are recommendable for CPU core clusters with a shared L2 cache, CPU clusters with a unified L2 cache on a single MMU (memory management unit), and CPU cores that each have a private L2 cache?

On Linux distributions, the L2 cache size¹ can be read with console commands such as:
cat /sys/devices/system/cpu/cpu0/cache/index2/size (for cpu0 on multicore system, for example)
find /sys/devices/system/cpu/*/cache/index*/size -print -exec cat {} \;
lshw | grep -B 11 -C 11 -e level=2
dmidecode -t cache
lstopo-no-graphics
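
Since the question distinguishes shared from private L2 caches, the sysfs entries above can be combined into one listing. The following is only a minimal sketch: the helper name to_kb is made up for illustration, the "M" suffix handling is an assumption (sysfs usually reports sizes with a "K" suffix), and the loop prints nothing on systems that do not expose these files.

```shell
#!/bin/sh
# Hypothetical helper: convert a sysfs size string such as "256K" or "8M"
# to a plain number of kilobytes, the unit --param l2-cache-size expects.
to_kb() {
    case "$1" in
        *K) echo "${1%K}" ;;
        *M) echo $(( ${1%M} * 1024 )) ;;  # assumption: "M" means MiB
        *)  echo "$1" ;;                  # unknown suffix: pass through unchanged
    esac
}

# For every cache of cpu0, print level, type, size in kB, and the CPUs sharing it.
for dir in /sys/devices/system/cpu/cpu0/cache/index*; do
    [ -r "$dir/size" ] || continue        # skip if sysfs does not expose caches
    printf 'L%s %s: %s kB, shared by CPUs %s\n' \
        "$(cat "$dir/level")" \
        "$(cat "$dir/type")" \
        "$(to_kb "$(cat "$dir/size")")" \
        "$(cat "$dir/shared_cpu_list" 2>/dev/null)"
done
```

A shared_cpu_list that names a single CPU suggests a private cache, while a range such as 0-3 suggests a cache shared by that cluster.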

Is this specific enough to ask for support on the gcc mailing list, or is there more widely available documentation for L2-cache-related compiler flags?

Thx

¹ Short summary of "How to find the L2 cache size in Linux?"

  • If gcc guesses a wrong default cache size (L1I, L1D, L2ID, L3ID, L4ID), it might help to know about --param l2-cache-size. By the way, on additional cores vs. cache sizes and cache misses, see superuser.com/a/317785/981508
    – beyondtime
    Commented Jul 7, 2019 at 17:08

1 Answer


This question has been raised in the bug report
Bug 87444: 'gcc -march=native' sets L2 cache size equal to L3 cache size on i7 and i5 CPU.

It seems that the parameter l2-cache-size does not necessarily refer to the L2 cache. gcc selects it dynamically: it is set to the L3 cache size when an L3 cache exists, and equals the L2 cache size only if there is no L3 cache.
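
To see which value gcc picked on a given machine, one option is to print the internal compiler command line that -march=native expands to and pull the parameter out of it. This is a rough sketch, not a definitive method: extract_l2 is a hypothetical helper, and on targets where -march=native is unsupported or passes no cache parameters the command simply prints nothing.

```shell
#!/bin/sh
# Hypothetical helper: read a compiler command line on stdin and print the
# number given to l2-cache-size, if any.
extract_l2() {
    sed -n 's/.*l2-cache-size=\([0-9][0-9]*\).*/\1/p' | head -n 1
}

# With -v, gcc writes the expanded cc1 invocation to stderr; with
# -march=native it typically contains "--param l2-cache-size=<kB>".
if command -v gcc >/dev/null 2>&1; then
    gcc -march=native -E -v - </dev/null 2>&1 | extract_l2
fi
```

Comparing the printed number with the sizes reported under /sys/devices/system/cpu/cpu0/cache/ shows whether gcc used the L2 or the last-level cache size.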

A developer remarks that it should have really been renamed to last-level-cache. So if there is an L4 cache, gcc will use that size.

I don't think it is a good idea to change this parameter. But in case you do, the bug report remarks that this parameter is used only for a few small heuristics, and judging by the tone of the comment it is relatively unimportant.

  • Thx @harrymc: Would you suggest a CPU/GPU/NPU benchmark setting (a tool compilable from source?) for clarifying this on individual hardware (arch, fs, software) situations? (corrected after timeout) ... Lowering l2-cache-size would most probably increase cache misses (maybe some comparison of that is available, even if the cost of the compiling effort is higher than the resulting binaries' performance gain) ... or is there something to save energy instead of knowing the details?
    – beyondtime
    Commented Jul 7, 2019 at 16:22
  • The best source for benchmarks is PassMark.
    – harrymc
    Commented Jan 18, 2022 at 8:17
