The question confuses the image formed by the lens with the projection that lands on the film.
In this diagram, segment PQ is the object and P'Q' is its focused image, which obeys the magnification formula in the question. The picture recorded on the film, however, is the segment R1R2, a blurred, defocused patch. Here PT is the chief ray (defined as the ray that passes through the center of the lens element), so T is the center of the bokeh ball, and this is where the object is projected. The two marginal rays define the edges of the bokeh ball.
We see
$$\triangle PQO\sim \triangle TUO\implies \frac{h}{s}=-\frac{\overline{TU}}{D}\implies\overline{TU}=-\frac{hD}{s}$$
Indeed, this is the same formula as the projection in a pinhole camera. The focal length does not appear here.
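To put numbers on this, here is a minimal sketch of the formula above. The function name `pinhole_projection` and the sample values are my own illustration, not something from the question.

```python
def pinhole_projection(h, s, D):
    """Height of the chief-ray landing point T on the film.

    h: object height, s: object distance in front of the lens,
    D: lens-to-film distance (all in the same length unit).
    The minus sign encodes the inverted projection.
    """
    return -h * D / s


# Hypothetical numbers: a 1.8 m subject 3 m away, film 50 mm behind the lens.
tu = pinhole_projection(h=1800.0, s=3000.0, D=50.0)
print(tu)  # -30.0 (mm, inverted)
```

Note that no focal length is passed in: only h, s and D matter for where the blurred object lands.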
For casual photographers I think this argument is enough.
===
To make it more interesting, @MichaelC has mentioned that one can place a pupil here. So let us position a physical aperture stop behind the lens. The entrance pupil (EP) is defined as the image of the aperture formed by the lens. We can have these cases:
The aperture is placed within f; the EP is a virtual image behind it (most common).
The aperture is placed outside f; the EP is on the other side of the lens, and the object is between the lens and the EP.
The aperture is placed outside f; the EP is on the other side of the lens, and the object is in front of the EP.
To convert between the position of the EP, d, and the position of the aperture (exit pupil), δ, use the thin-lens equation:
$$\frac{1}{d}+\frac{1}{\delta}=\frac{1}{f}$$
With the appropriate sign convention for d, all three cases can be treated in the same way.
First I would like to get the properties of the chief ray at the center of the EP. Using geometry:
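As a rough sketch, the conversion can be written as a small helper. The name `conjugate_distance` is hypothetical, and I am assuming the convention in which a negative result means a virtual conjugate; adapt the signs to whichever convention you use.

```python
from fractions import Fraction


def conjugate_distance(d, f):
    """Solve 1/d + 1/delta = 1/f for delta.

    Assumed convention: a negative result means the conjugate is virtual.
    """
    d, f = Fraction(d), Fraction(f)
    if d == f:
        raise ValueError("d == f puts the conjugate at infinity")
    return d * f / (d - f)


print(conjugate_distance(100, 50))  # 100
print(conjugate_distance(25, 50))   # -50
```

Exact fractions are used only to avoid rounding noise in the illustration; floats would work just as well.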
$$\theta'\approx\tan\theta'=-\frac{h}{s-d}$$
(Note: if I do this with the marginal ray, I get the location of the edge of the blur, which is used to calculate the circle of confusion / depth of field. That's not the purpose of this question, so I'll skip it.)
Then, pretending the light ray originates from that location, I can trace the ray until it hits the film. The ABCD matrix method is convenient, so I'll use it to simplify things. The chief ray can be viewed as going through air, then the lens, then air again before hitting the film. So:
$$\begin{bmatrix}1&D\\0&1\end{bmatrix}\begin{bmatrix}1&0\\-\frac{1}{f}&1\end{bmatrix}\begin{bmatrix}1&d\\0&1\end{bmatrix}\begin{bmatrix}0\\\frac{h}{d-s}\end{bmatrix}=\begin{bmatrix}\frac{hDf-hdD+hdf}{fd-fs}\\\frac{hd-hf}{fs-fd}\end{bmatrix}$$
hence the location of the projection is
$$\overline{TU}=\frac{hDf-hdD+hdf}{fd-fs}$$
which has a little of everything including the location of the film and the EP, and the focal length.
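If you want to check the matrix product without doing the algebra by hand, the following is a minimal sketch that traces the chief ray numerically and compares it with the closed form. The helper names (`matmul`, `free_space`, `trace_chief_ray`, `closed_form`) and the sample values are assumptions for illustration only.

```python
from fractions import Fraction


def matmul(a, b):
    """Multiply two 2x2 matrices given as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def free_space(t):
    """ABCD matrix for propagation over a distance t."""
    return ((1, t), (0, 1))


def trace_chief_ray(h, s, d, f, D):
    """Return (height, angle) of the chief ray at the film."""
    h, s, d, f, D = (Fraction(x) for x in (h, s, d, f, D))
    thin_lens = ((1, 0), (-1 / f, 1))
    # The rightmost matrix acts first: EP -> lens, the lens, lens -> film.
    system = matmul(free_space(D), matmul(thin_lens, free_space(d)))
    y0, theta0 = Fraction(0), h / (d - s)  # ray at the center of the EP
    y = system[0][0] * y0 + system[0][1] * theta0
    theta = system[1][0] * y0 + system[1][1] * theta0
    return y, theta


def closed_form(h, s, d, f, D):
    """The expression for TU derived above."""
    h, s, d, f, D = (Fraction(x) for x in (h, s, d, f, D))
    return (h * D * f - h * d * D + h * d * f) / (f * d - f * s)


# Hypothetical values in an arbitrary length unit.
y, theta = trace_chief_ray(h=10, s=1000, d=20, f=50, D=52)
print(y == closed_form(10, 1000, 20, 50, 52))  # True
```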
If I shrink the distance between the EP and the lens to zero, i.e. set d -> 0, we obtain:
$$\overline{TU}|_{d=0}=-\frac{hD}{s}$$
So the dependency on f has disappeared.
As a check on the result, setting d -> infinity (i.e. placing the aperture at the rear focal point) gives an object-space telecentric lens that removes perspective distortion (no dependency on s):
$$\overline{TU}|_{d\rightarrow\infty}=\frac{h(f-D)}{f}$$
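One way to see this limit, as a sketch, is to divide the numerator and denominator of the general expression by d, so that the terms carrying s and Df vanish as d grows:

$$\overline{TU}=\frac{h\left(\frac{Df}{d}-D+f\right)}{f\left(1-\frac{s}{d}\right)}\xrightarrow{d\rightarrow\infty}\frac{h(f-D)}{f}$$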
Setting d = f (i.e. placing the aperture at the front focal point), we get an image-space telecentric lens (no dependency on D):
$$\overline{TU}|_{d=f}=\frac{hf}{f-s}$$
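The three special cases can also be checked numerically. This is only a sketch: the function name `projection` and the sample values are hypothetical, and the large-d case is approached with increasingly large finite d rather than a true limit.

```python
from fractions import Fraction


def projection(h, s, d, f, D):
    """Closed-form chief-ray height on the film (general expression for TU)."""
    h, s, d, f, D = (Fraction(x) for x in (h, s, d, f, D))
    return (h * D * f - h * d * D + h * d * f) / (f * d - f * s)


h, s, f, D = 10, 1000, 50, 52  # hypothetical values, arbitrary length unit

# d = 0: pinhole-like result, no dependence on f.
assert projection(h, s, 0, f, D) == Fraction(-h * D, s)

# d = f: image-space telecentric, no dependence on D.
assert projection(h, s, f, f, D) == Fraction(h * f, f - s)
assert projection(h, s, f, f, D) == projection(h, s, f, f, 2 * D)

# Large d: approaches the object-space telecentric value h(f - D)/f.
limit = Fraction(h * (f - D), f)
for d in (10**6, 10**9, 10**12):
    print(d, float(projection(h, s, d, f, D) - limit))
```

The printed differences should shrink toward zero as d increases, which is the numerical counterpart of the s-independence in the telecentric limit.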