Consider the NPN transistor. It is constructed of N-P-N layers, that is, roughly symmetrically.
Real parts aren't symmetric for a number of useful and practical reasons (VCE(sat), gain, breakdown voltages, speed, etc. are all affected), but let's stick with this for now and see where it takes us.
Suppose we ground one N-layer (emitter), connect positive voltage to the other (collector), and apply a bias current to the P-layer (base). As if by magic, current is drawn through the collector, at IC = IB hFE.
Note that hFE need not be constant; we're not saying this is a ratio that must hold, simply that the currents are nonzero and therefore a ratio exists. More particularly, hFE > 0 for the most part; and when it's not positive, we shall transform the circuit, so that either hFE doesn't apply, or an alternative definition applies.
If we swap around which N-layers we're biasing, we get the same thing; nothing has changed. But we have swapped terminals, so we call it "inverted operation". If we keep terminals labeled the same way, with the same current directions, we instead have IE = IB hFE(R). But for a perfectly symmetrical transistor, this is simply swapping C and E, which are identical, so we would have hFE = hFE(R).
Now, suppose the transistor is not symmetric. The forward and reverse hFEs will not match, among other things. But we still get, erm, transistance, where collector current flows as a consequence of base current. The amount varies between configurations, but not the phenomenon itself -- it suffices that it's still a transistor, whether inverted or not.
Let us also consider the case of both N-layers grounded, that is, VCE = 0. In this case, regardless of any "transistance" that might happen (and indeed, since the situation is perfectly symmetrical, any induced C-E current must balance to zero), we can draw an equivalent circuit like two diodes in parallel. If, instead of a hard ground, we vary the collector voltage slightly, we unbalance the diode pair, and so for small changes (perhaps tens of mV) the C-E characteristic looks like a resistance.
However, if we raise the C-E voltage further (or E-C, as the case may be), the diode equivalent fails: the collector (or emitter) current changes sign (i.e., hFE or hFE(R) > 0 again), and we observe transistor action again.
And we can indeed understand saturation as if the B-C junction were a diode stealing base current from the B-E path, sneaking it into the collector path, reducing hFE, and approaching that "two diodes in parallel" equivalent circuit. Those two diodes are always there, of course; it's just that we normally reverse-bias one. Indeed, VBE drops ever so slightly in saturation, so it's even a measurable external effect.
As you continue your studies, you will encounter the Ebers-Moll model. Mind that the simplified form usually presented first, the single exponential relating IC to VBE, is only valid in the linear range (VCE > VCE(sat)), and is meaningless in reverse (that is, inverted). It's an excellent description of the phenomenon of "transistance" in this condition -- but it's far from a general description, as you can see. Put in values outside the intended range and you get garbage. Well, there exist more general models -- they're harder to work with, so you wouldn't want to be working problems on pad and paper with them, for example, but for computational purposes they're valid for any combination of terminal voltages and currents. The full Ebers-Moll model, which keeps both junction diodes, is one; the most common is its extension, the Gummel-Poon model, which is used by most SPICE simulators. Two of the parameters of this model are indeed the forward and reverse hFE (BF and BR in SPICE).
So, in summary: what happens at low VCE is continuous from low negative to low positive VCE, and it can be understood as hFE falling (as VCE falls, entering forward saturation), the junctions acting like diodes shunting base current (VCE near zero, in deep saturation), then hFE(R) taking over and rising again (as VEC rises, out of inverted saturation).
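The continuity described above can be checked numerically. What follows is a minimal sketch, not a device-accurate simulation. It assumes the transport form of the two-diode model (a forward and a reverse exponential sharing one saturation current), and it ignores Early effect, terminal resistances and high-injection effects. The function name terminal_currents and the values chosen for IS, BF, BR and VT are illustrative assumptions, not taken from any datasheet. For a fixed base current it solves for VBE in closed form and returns the collector and emitter currents at a given VCE.

```python
import math

VT = 0.02585  # assumed thermal voltage near room temperature, in volts


def terminal_currents(ib, vce, i_s=1e-14, bf=100.0, br=5.0):
    """Return (ic, ie) for a fixed base current ib and C-E voltage vce.

    ic is the current flowing into the collector, ie the current flowing
    out of the emitter, so ie = ic + ib. All parameter values are
    illustrative assumptions.
    """
    # k = exp(-VCE/VT) links the two junctions: exp(VBC/VT) = x * k,
    # where x = exp(VBE/VT).
    k = math.exp(-vce / VT)
    # Base current balance: ib = icc/bf + iec/br, solved for x.
    x = (ib + i_s / bf + i_s / br) / (i_s / bf + i_s * k / br)
    icc = i_s * (x - 1.0)      # forward transport current (set by VBE)
    iec = i_s * (x * k - 1.0)  # reverse transport current (set by VBC)
    ic = icc - iec * (1.0 + 1.0 / br)
    ie = icc * (1.0 + 1.0 / bf) - iec
    return ic, ie


if __name__ == "__main__":
    ib = 10e-6  # 10 uA of base drive, chosen arbitrarily
    for mv in range(-300, 301, 50):
        vce = mv / 1000.0
        ic, ie = terminal_currents(ib, vce)
        print(f"VCE = {vce:+.2f} V   IC/IB = {ic / ib:+9.3f}   IE/IB = {ie / ib:+9.3f}")
```

Swept from negative to positive VCE, the IC/IB column in this model moves smoothly from about -(BR + 1), through a small negative value at VCE = 0 where the junctions just split the base current, up to about BF. With BF set equal to BR, the split at VCE = 0 is exactly half the base current out of each N-layer, which is the "two diodes in parallel" picture.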