Background
In general, when decoupling capacitors are placed in parallel, their capacitances add and their combined ESR is reduced (as with parallel resistors). But I am uncertain whether, and how, this applies to their inductance, which is the most crucial aspect in high-frequency decoupling.
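As a quick numeric illustration of the capacitance and ESR part, here is a minimal sketch that assumes all capacitors are identical. The function name `parallel_bank` and the component values are hypothetical and chosen only for illustration.

```python
def parallel_bank(c_each, esr_each, n):
    """Total capacitance and ESR of n identical capacitors in parallel.

    Assumes identical parts: capacitances add, and ESR combines
    like n equal resistors in parallel.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return n * c_each, esr_each / n


# Hypothetical example: four 100 nF capacitors, 20 mOhm ESR each.
c_total, esr_total = parallel_bank(100e-9, 20e-3, 4)
print(f"C = {c_total * 1e9:.0f} nF, ESR = {esr_total * 1e3:.1f} mOhm")
```

This says nothing about inductance, which is what the question below is about.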
The inductance seen by the decoupling current is usually determined by the loop it flows around, which is bounded by the planes, vias, and capacitor leads. The following cross-section is commonly shown to explain the inductance of a decoupling capacitor (image from p.17 here):
So far so good. Now the same document also shows a suggested lateral placement of several capacitors on p.16:
This document is of course not alone in making these suggestions. It is a general practice that I have seen many times. What makes this reference slightly more relevant is that it focuses not so much on ripple current capacity as on the loop inductance of the input capacitance.
Question
If several (\$N\$) capacitors are paralleled, then, regardless of how many, the first image above always holds, i.e. the entire decoupling current still has to travel around the entire blue loop. The document is right to note that the currents in Layer 1 and Layer 2 flow in opposite directions, which gives low inductance (i.e. a small loop area). But the currents through the individual capacitors all flow in the same direction, so paralleling the capacitors produces no flux cancellation and hence no inductance reduction. Is that correct?
Another related aspect: When I place the capacitors with much greater mutual separation (probably not practical), the first figure (cross-section) still holds, but their magnetic field coupling is reduced. In that case, one could argue that each capacitor has the loop inductance given by the blue loop in the image. But because there are now several independent such loops their impedances are in parallel. That argument would suggest that inductance is reduced to \$L_\text{loop}/N\$.
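To put the two limiting cases side by side, here is a minimal sketch of an idealized model. It assumes \$N\$ identical loops, each with self-inductance \$L_\text{loop}\$, and the same coupling coefficient \$k\$ between every pair (so \$M = k L_\text{loop}\$). By symmetry each branch carries \$I/N\$, which gives \$L_\text{eff} = L_\text{loop}\,(1 + (N-1)k)/N\$. The function name, the uniform-\$k\$ assumption, and the 1 nH example value are mine, not from the referenced document; a real layout has unequal couplings.

```python
def effective_inductance(l_loop, n, k):
    """Effective inductance of n identical, symmetrically coupled loops in parallel.

    Idealized assumption: every pair of loops has the same coupling
    coefficient k (mutual inductance M = k * l_loop), so the current
    splits equally and each branch sees l_loop + (n - 1) * M.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if not 0.0 <= k <= 1.0:
        raise ValueError("k must be between 0 and 1")
    return l_loop * (1.0 + (n - 1) * k) / n


# Hypothetical example: 1 nH per loop, four capacitors.
for k in (0.0, 0.5, 1.0):
    l_eff = effective_inductance(1e-9, 4, k)
    print(f"k = {k:.1f}: L_eff = {l_eff * 1e12:.1f} pH")
```

In this model, \$k = 1\$ (capacitors packed tightly, sharing the same flux) gives no reduction at all, while \$k = 0\$ (fully independent loops) gives \$L_\text{loop}/N\$. Whether real placements sit closer to one limit or the other is exactly what I am unsure about.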
So is it actually a good idea to spread the capacitance over as wide an area as possible, while of course still minimizing the individual loops as the primary goal?