I've seen a lot of my fellow students and friends who are into electronics struggle with this, so I decided to type out a fairly lengthy response. I hope people looking for similar information find it helpful.
We need to start with an understanding of what the decoupling capacitor is actually used for in the first place. Many people will say that it is there to filter out noise from other components. This is not the main or only reason, especially in digital circuits! If we have, say, a CMOS circuit, we will find that most of the current draw happens in very brief spikes on the clock edges. Without decoupling, this current has to flow through the trace coming from the power supply. This trace has a resistance R and an inductance L (both are shown in a schematic further down). A sudden spike of current through this combination causes a large voltage drop over the trace, leading to issues ranging from noise in mixed-signal circuits and jitter on clocks to complete failure of the device, requiring a reset.
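To get a feel for the numbers, here is a rough sketch of that voltage drop. It assumes the current spike is a linear ramp, so the drop is approximately \$I R + L \frac{\Delta I}{\Delta t}\$. The trace values and the spike size are made-up illustration values, not measurements.

```python
def trace_voltage_drop(current_step_a, rise_time_s, resistance_ohm, inductance_h):
    """Approximate peak drop over a supply trace for a linear current ramp."""
    resistive = current_step_a * resistance_ohm                # I * R
    inductive = inductance_h * current_step_a / rise_time_s    # L * dI/dt
    return resistive + inductive

# Hypothetical example: a 100 mA step in 1 ns through a trace with
# 10 mOhm of resistance and 10 nH of inductance.
drop = trace_voltage_drop(
    current_step_a=0.1,
    rise_time_s=1e-9,
    resistance_ohm=0.01,
    inductance_h=10e-9,
)
print(f"Approximate drop over the trace: {drop:.3f} V")
```

With these assumed values the inductive term dominates the resistive one by a wide margin, which is why the inductance gets so much attention in the rest of this answer.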
By placing a capacitor close to the pin, we have a local source of current during these spikes. On average, the capacitor is charged from the power supply. During the short current spikes, the capacitor discharges and provides the current that is needed.
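As a first-order estimate of how much capacitance this takes, we can treat the capacitor as ideal and the spike as a constant current, so the voltage droop is \$\Delta V = \frac{I \, \Delta t}{C}\$. This sketch ignores ESL and ESR entirely, and the numbers are hypothetical.

```python
def droop_voltage(current_a, duration_s, capacitance_f):
    """Voltage droop of an ideal capacitor supplying a constant current."""
    return current_a * duration_s / capacitance_f

def required_capacitance(current_a, duration_s, max_droop_v):
    """Smallest ideal capacitance that keeps the droop below max_droop_v."""
    return current_a * duration_s / max_droop_v

# Hypothetical spike: 100 mA for 2 ns.
droop = droop_voltage(current_a=0.1, duration_s=2e-9, capacitance_f=100e-9)
c_min = required_capacitance(current_a=0.1, duration_s=2e-9, max_droop_v=0.05)

print(f"Droop with 100 nF: {droop * 1e3:.1f} mV")
print(f"Capacitance for 50 mV droop: {c_min * 1e9:.1f} nF")
```

Under these assumptions the capacitance itself is rarely the limiting factor, which is the point of the rest of this answer.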
Now, on to the question:
It is a very common mistake to think that it's only the capacitance value that matters. The key is the combination of Equivalent Series Inductance (ESL) and capacitance. The equivalent schematic used in this post is the following:
![Basic equivalent schematic for a capacitor](https://cdn.statically.io/img/i.sstatic.net/BAmF5.png)
In this schematic, C is the rated capacitance and ESL is the equivalent series inductance, which depends on the capacitor design, size, and type.
We often see something like this on our schematic:
![schematic decouplingcapacitor ideal](https://cdn.statically.io/img/i.sstatic.net/FkQOD.png)
When we place this on the PCB, it looks like this:
![PCB decoupling capacitor](https://cdn.statically.io/img/i.sstatic.net/FEqRX.png)
But when we look at the main non-idealities, the picture becomes more complex. We first need to replace our ideal capacitor with the circuit including the ESL. On top of that, we need to include the trace resistance and inductances. Our simple schematic is now:
![Withnonidealities](https://cdn.statically.io/img/i.sstatic.net/8gt9t.png)
Using a smaller capacitor with the same ESL closer to the part makes (almost - see (1)) no difference - in fact, its performance might be worse! The reason we often want smaller capacitance values closer to the chip is that, in general, the ESL/ESR of these parts is lower (we usually use smaller packages, and within a given family these generally have lower ESL).
The closer you can get it to the pin, the better, since the trace towards the pin also adds inductance and resistance. Note that it is not just the positive trace that matters - it is the inductance of the entire loop, including the ground connection.
This Intersil appnote is a very good starting resource for more in-depth information and figures. All figures that follow in this post are from this appnote.
http://www.intersil.com/content/dam/Intersil/documents/an13/an1325.pdf
If we look at an ideal capacitor, we know its impedance as a function of frequency has a \$\frac{1}{f}\$ behaviour - as we increase the frequency, the impedance decreases, and thus we get better filtering.
However, every actual capacitor also has an ESL. If we include this ESL, and we look at the impedance of the capacitor as a function of frequency, we get the following picture:
Source: Tamara Schmitz, Mike Wong, Intersil Application Note 1325: Choosing and Using Bypass Capacitors
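The shape of that curve can be reproduced with a small sketch of the C + ESL model from the schematic above. ESR is left out here, so the impedance goes all the way to zero at the self-resonant frequency \$f_{res} = \frac{1}{2\pi\sqrt{L C}}\$. The 0.1 µF and 1 nH values are assumptions chosen for illustration, not taken from the appnote.

```python
import math

def capacitor_impedance(freq_hz, capacitance_f, esl_h):
    """|Z| of an ideal capacitor in series with its ESL (ESR ignored)."""
    omega = 2 * math.pi * freq_hz
    return abs(omega * esl_h - 1 / (omega * capacitance_f))

def self_resonant_frequency(capacitance_f, esl_h):
    """Frequency where the capacitive and inductive reactances cancel."""
    return 1 / (2 * math.pi * math.sqrt(capacitance_f * esl_h))

C = 0.1e-6   # hypothetical 0.1 uF capacitor
ESL = 1e-9   # hypothetical 1 nH of series inductance

srf = self_resonant_frequency(C, ESL)
print(f"Self-resonant frequency: {srf / 1e6:.1f} MHz")
for freq in (1e6, srf, 1e9):
    z = capacitor_impedance(freq, C, ESL)
    print(f"|Z| at {freq / 1e6:8.1f} MHz: {z:.4f} ohm")
```

Below the self-resonant frequency the part behaves like a capacitor and the impedance falls with frequency. Above it, the ESL takes over and the impedance rises again.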
So, what happens if we now add more capacitors? Let's say we choose a 1 µF, a 0.1 µF and a 0.01 µF capacitor, all in an 0805 form factor. If we now put them together, we get the following plot:
Source: Tamara Schmitz, Mike Wong, Intersil Application Note 1325: Choosing and Using Bypass Capacitors
What we see is that even though we added smaller capacitors, we don't really get any significant benefit - the inductance causes our impedance to go up, removing all benefit of the smaller capacitors.
What we should have done is use 3 different packages, say, 0805, 0603 and 0402. This would have given us the following figure:
Source: Tamara Schmitz, Mike Wong, Intersil Application Note 1325: Choosing and Using Bypass Capacitors
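The two scenarios above can also be compared numerically. The sketch below puts three capacitor branches in parallel and looks at the impedance at one frequency well above all the self-resonances. The ESL values per package (1 nH, 0.7 nH, 0.4 nH) and the 10 mΩ ESR are assumptions picked for illustration. Look up the real values in the datasheets of your parts.

```python
import math

def branch_impedance(freq_hz, capacitance_f, esl_h, esr_ohm=0.01):
    """Complex impedance of one capacitor: ESR + ESL + C in series."""
    omega = 2 * math.pi * freq_hz
    return complex(esr_ohm, omega * esl_h - 1 / (omega * capacitance_f))

def parallel_impedance(freq_hz, capacitors):
    """|Z| of several (capacitance, ESL) branches connected in parallel."""
    admittance = sum(1 / branch_impedance(freq_hz, c, l) for c, l in capacitors)
    return abs(1 / admittance)

# (capacitance in F, ESL in H) per branch; all ESL values are hypothetical.
configs = {
    "identical": [(1e-6, 1e-9)] * 3,
    "mixed_values_same_esl": [(1e-6, 1e-9), (0.1e-6, 1e-9), (0.01e-6, 1e-9)],
    "mixed_values_mixed_esl": [(1e-6, 1e-9), (0.1e-6, 0.7e-9), (0.01e-6, 0.4e-9)],
}

FREQ_HZ = 500e6  # well above the self-resonance of every branch
results = {name: parallel_impedance(FREQ_HZ, caps) for name, caps in configs.items()}
for name, z in results.items():
    print(f"{name:24s} |Z| = {z:.3f} ohm")
```

With the same ESL everywhere, swapping in smaller capacitance values barely changes the high-frequency impedance compared to three identical 1 µF parts. Only the branches with lower ESL bring it down.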
(1) Of course, since it's closer to the part there will be less trace inductance and resistance, but you might as well just put the bigger capacitance part closer, eliminating this difference.