A premise. The physical realization of a device must follow as closely as possible the structure identified by the mathematical model behind it. Otherwise, second-order effects that are considered negligible may show up with unexpected or undesirable results.
That said, I can proceed to answer the question(s).
Q. Wouldn't it be possible to just form a long shallow \$n\$-region at the surface using a single implantation and then connect one end as source and the opposite end as drain?
A. As noted in the OP, (junction) field effect structures of this type are constructively (or better, intrinsically, meaning that you cannot get rid of them) present in any field effect device, their control pin being called the bulk B: see for example this Q&A on how this terminal is connected in field effect devices. Moreover, this structure works exactly as expected, provided the gate and the other terminals are properly biased. However, there are several reasons that prevent semiconductor technologists from producing JFET devices in that form; possibly the most important ones are listed below.
- The bulk region is lightly doped with respect to the channel, which means that the junction depletion layer extends more into this type of gate than into the channel of the device. Controlling channel conduction by reducing/enhancing its width therefore requires larger voltage variations than in a structure where the gate is more heavily doped than the channel, and this means a lower forward transconductance \$g_m\$ and thus lower gain. So even if the geometry is (almost) the same, you get a worse JFET device.
- Due to the same low doping, the bulk gate constructively has a non-negligible extrinsic series gate resistance \$r_G\$. From the equivalent circuit point of view, the result is the one shown below.
From this it can be inferred that this resistance has two bad effects on the overall device performance. First, it raises the gate equivalent noise voltage in the following way:
$$
e_{n_\text{tot}}=\sqrt{e_{n_i}^2+e_{n_{r_G}}^2},
$$
and second, it creates a high-frequency pole that limits the high-frequency response of the JFET. Again, you get a worse JFET device.
[Schematic created using CircuitLab: JFET equivalent circuit with the extrinsic gate resistance \$r_G\$ in series with the gate]
- The gate flicker (\$1/f\$) noise voltage \$e_{1/f}\$ varies inversely with the gate area \$A_G\$: if you give up the top diffused gate, you get a device with a smaller gate area and thus higher flicker/burst noise. In sum, you again get a worse JFET device.
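To put rough numbers on the first two points, here is a minimal Python sketch. It rests on textbook assumptions that are not part of the original answer: an abrupt junction, where charge neutrality splits the depletion width in inverse proportion to the doping on each side; purely thermal (Johnson) noise for the gate resistance, \$e_{n_{r_G}}=\sqrt{4kTr_G}\$; and a single RC pole formed by \$r_G\$ and a lumped input capacitance. The doping levels, the resistance, the capacitance `c_in` and the intrinsic noise value are all hypothetical, chosen only for illustration.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, J/K


def channel_depletion_fraction(n_gate, n_channel):
    """Fraction of the total depletion width lying on the channel side.

    Assumes an abrupt junction: n_gate * x_gate = n_channel * x_channel.
    """
    return n_gate / (n_gate + n_channel)


def thermal_noise_density(r_ohm, temp_k=300.0):
    """Johnson noise voltage density of a resistor, in V/sqrt(Hz)."""
    return math.sqrt(4.0 * K_B * temp_k * r_ohm)


def total_noise_density(e_ni, r_g, temp_k=300.0):
    """Root-sum-square of the intrinsic noise e_ni and the noise of r_g."""
    return math.hypot(e_ni, thermal_noise_density(r_g, temp_k))


def gate_pole_frequency(r_g, c_in):
    """Pole of the RC low-pass formed by r_g and the input capacitance c_in."""
    return 1.0 / (2.0 * math.pi * r_g * c_in)


if __name__ == "__main__":
    n_channel = 1e16  # cm^-3, hypothetical channel doping
    for label, n_gate in (("bulk gate", 1e15), ("diffused gate", 1e18)):
        frac = channel_depletion_fraction(n_gate, n_channel)
        print(f"{label}: {frac:.1%} of the depletion layer is in the channel")

    r_g = 1e3      # ohm, hypothetical extrinsic gate resistance
    e_ni = 3e-9    # V/sqrt(Hz), hypothetical intrinsic noise
    c_in = 10e-12  # F, hypothetical input capacitance
    print(f"e_n_tot = {total_noise_density(e_ni, r_g):.2e} V/sqrt(Hz)")
    print(f"f_pole  = {gate_pole_frequency(r_g, c_in):.2e} Hz")
```

With these made-up values, only about a tenth of the depletion layer of the lightly doped bulk gate falls inside the channel, while for the heavily doped gate almost all of it does. That is the mechanism behind the lower \$g_m\$ described above.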
Q. The depletion would form only from the bottom, but it should work as a JFET regardless, no?
A. Yes and no. As said above, a single-diffusion structure works as a JFET device provided its bulk contact is operated as a gate electrode. Nevertheless, leaving the upper side of the channel without a properly biased gate region causes a subtle problem, because the surface of the device cannot be left as is (or, more correctly, it will not remain a simple semiconductor surface).
The device must be protected from chemical contaminants, so it is necessary to build a \$\mathrm{SiO_2}\$ passivating layer on top of the wafer. When there is no gate region below this layer, it causes at least two different problems:
The first one is a higher leakage current of the JFET channel, since the structure is now a JFET connected in parallel with a MOSFET without a proper gate bias (effectively without a gate). This situation is well described in the following picture (taken from the third printing of the High Speed Transistor Switching Handbook, edited by William D. Roher, Motorola 1963), which shows this effect and a countermeasure for a planar PNP BJT.
The second one is the classical SOI floating body effect, which causes a serious distortion of the device characteristics, as shown by the orange and violet lines in this picture. Once again, you get a worse JFET device.