From *Reviews in Computational Chemistry*, Vol. 5, O-H bond lengths for $\ce{H2O}$ computed with HF, MP2, CCSD, and CCSD(T), all lengths in angstroms:
$$ \small \begin{array}{rcccc} \text{Basis} & \text{HF} & \text{MP2} & \text{CCSD} & \text{CCSD(T)} \\ \hline \text{STO-3G} & 0.989 & 1.013 & 1.028 & 1.028 \\ \text{3-21G} & 0.967 & 0.989 & 0.993 & 0.994 \\ \text{DZ} & 0.951 & 0.979 & 0.979 & 0.980 \\ \text{6-31G*} & 0.947 & 0.969 & 0.969 & 0.971 \\ \text{DZP} & 0.944 & 0.963 & 0.961 & 0.962 \\ \text{TZ2P} & 0.941 & 0.958 & 0.956 & 0.959 \\ \text{Experimental} & 0.957 \end{array} $$
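As a quick sanity check on the table above, here is a minimal Python sketch that tests whether each method's length decreases monotonically with basis size and reports the signed deviation from experiment at TZ2P. The numbers are copied from the table; the variable names are my own.

```python
# Lengths in angstroms, copied from the H2O table above.
# Basis order: STO-3G, 3-21G, DZ, 6-31G*, DZP, TZ2P
r_oh = {
    "HF":      [0.989, 0.967, 0.951, 0.947, 0.944, 0.941],
    "MP2":     [1.013, 0.989, 0.979, 0.969, 0.963, 0.958],
    "CCSD":    [1.028, 0.993, 0.979, 0.969, 0.961, 0.956],
    "CCSD(T)": [1.028, 0.994, 0.980, 0.971, 0.962, 0.959],
}
r_exp = 0.957  # experimental value from the last row of the table

# True if the length never increases when moving to the next basis set.
monotonic = {
    method: all(b <= a for a, b in zip(values, values[1:]))
    for method, values in r_oh.items()
}

# Signed deviation from experiment with the largest basis (TZ2P).
tz2p_error = {
    method: round(values[-1] - r_exp, 3) for method, values in r_oh.items()
}

for method in r_oh:
    print(f"{method:8s} monotonic={monotonic[method]}  "
          f"TZ2P - exp = {tz2p_error[method]:+.3f} A")
```

With these numbers every method shortens monotonically, and only HF ends up well below experiment at TZ2P.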
In case one is skeptical due to effects from the lone pairs and the changing bond angle, here are results for $\ce{BH}$ using Hartree-Fock:
$$ \small \begin{array}{rc} \text{Basis} & r_{e} \\ \hline \text{cc-pVDZ} & 1.235950 \\ \text{cc-pVTZ} & 1.222062 \\ \text{cc-pVQZ} & 1.220306 \\ \text{cc-pV5Z} & 1.220010 \\ \text{cc-pV6Z} & 1.219877 \end{array} $$
and for $\ce{LiH}$ using Hartree-Fock:
$$ \small \begin{array}{rc} \text{Basis} & r_{e} \\ \hline \text{cc-pVDZ} & 1.619017 \\ \text{cc-pVTZ} & 1.607865 \\ \text{cc-pVQZ} & 1.605530 \\ \text{cc-pV5Z} & 1.605519 \end{array} $$
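To make the trend in the two Hartree-Fock tables above easier to see, here is a short Python sketch that prints the change in $r_e$ for each step up in basis size. The values are copied from the tables; the function name `successive_changes` is just illustrative, and the unit is assumed to be angstroms, as in the water table.

```python
# Hartree-Fock r_e values copied from the BH and LiH tables above
# (unit assumed to be angstroms, as in the water table).
r_e = {
    "BH": {
        "cc-pVDZ": 1.235950,
        "cc-pVTZ": 1.222062,
        "cc-pVQZ": 1.220306,
        "cc-pV5Z": 1.220010,
        "cc-pV6Z": 1.219877,
    },
    "LiH": {
        "cc-pVDZ": 1.619017,
        "cc-pVTZ": 1.607865,
        "cc-pVQZ": 1.605530,
        "cc-pV5Z": 1.605519,
    },
}


def successive_changes(series):
    """Return (basis_from, basis_to, delta) for each step up in basis size."""
    names = list(series)  # dicts keep insertion order
    return [(a, b, series[b] - series[a]) for a, b in zip(names, names[1:])]


changes = {mol: successive_changes(series) for mol, series in r_e.items()}

for mol, steps in changes.items():
    for basis_from, basis_to, delta in steps:
        print(f"{mol}: {basis_from} -> {basis_to}: {delta:+.6f}")
```

Every step is negative, and the steps shrink rapidly, so the shortening persists (at a much smaller scale) even when cc-pVDZ is left out.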
Looking through the calculated geometries in the NIST CCCBDB, I cannot find a counterexample, at least when considering only HF. Adding dynamic correlation introduces an effect that counterbalances this bond shortening, since the electrons can then repel each other beyond what the mean-field approximation allows. DFT offers no insight beyond HF here.
One potential reason that doesn't receive much attention is intramolecular BSSE, where one part of a molecule borrows basis functions from another. However, in my calculations I would argue that only cc-pVDZ could be considered a "deficient" basis, and the trend is still seen when it is left out. I also believe that intramolecular BSSE is more plausible for molecules with distinct "domains" that can interact with each other, such as one amino acid in a peptide forming a hydrogen bond with another.
My current rationale is that as the basis set becomes more complete and functions with higher angular momentum (polarization functions) are included, the overlap between the atomic wave functions increases. This increased overlap results in higher electron density in the bonding region, which strengthens the bond and therefore shortens it, much like going from ethane to ethene to ethyne.
Specifically, I'm looking for a literature reference where an explanation is given for the phenomenon, not just examples with or without comparison to experiment.