I hope this question isn't too 'soft' for here. It's been a while $\tiny{\text{an eternity by some people's standards}}$ since I've touched this material, and I had a convincing explanation for this question eight minutes ago but must have misplaced it, so please help me find it again:
Most classical sorting algorithms one reads about are comparison sorts, in which a single comparison produces a binary outcome (ties broken arbitrarily). The motivation I remember for these is that they are general: they require only that an ordering be prescribed over the items, and they are insensitive to how the items are encoded. The price is an $\Omega(n \log n)$ lower bound on running time, simply because enough comparisons must be made to distinguish the $n!$ possible orderings.

One then meets algorithms such as radix sort, which is supposed to beat this bound under caveats such as the space of items being discrete and bounded, and assuming any decimal (or base-$b$) digit of an item can be retrieved in constant time. But as I recently thought about it, this digit-retrieval operation is essentially a $b$-way switch, as opposed to the ordinary $2$-way kind used in comparison sorts. A $b$-way switch yields at most $\log_2 b$ bits of information per operation, so radix sort should still be subject to essentially the same lower bound, improved by a factor of at most $(\log b)/(\log 2)$. How, then, is radix sort ever preferable to one of the standard comparison sorts? I know this reasoning must go wrong somewhere, but I cannot see where.
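To make concrete which operation I mean, here is a minimal LSD radix sort sketch (in Python; the function name and parameters are my own, not from any particular source). The digit extraction `(x // exp) % base` is the $b$-way switch I am describing: each pass costs $O(n + b)$, and the number of passes is $\lceil \log_b(\max\text{ key} + 1) \rceil$.

```python
def radix_sort(items, base=10):
    """LSD radix sort on non-negative integers; a sketch, not a tuned
    implementation. Each pass is a stable bucket/counting sort on one
    base-`base` digit."""
    if not items:
        return []
    max_key = max(items)
    out = list(items)
    exp = 1  # current digit's place value: 1, base, base**2, ...
    while max_key // exp > 0:
        # One stable pass: each item makes one b-way "probe" of its
        # current digit, landing in one of `base` buckets.
        buckets = [[] for _ in range(base)]
        for x in out:
            buckets[(x // exp) % base].append(x)
        # Concatenating buckets in order preserves stability.
        out = [x for bucket in buckets for x in bucket]
        exp *= base
    return out
```

On $n$ items drawn from $\{0, \dots, U-1\}$ this performs $\lceil \log_b U \rceil$ passes, i.e. $O(n \log_b U)$ digit probes in total, which is the count my argument above tries to compare against the comparison-sort bound.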
In my experience this kind of paradox typically dissolves once formalised; still, I would rather have an intuitive reason that explains it away, if one is available.