Usually, speakers work in the frequency region above their mechanical resonance (except "woofers" in the lowest frequency part of their range, see below).
Effect A: In that region the moving mass, not the spring force, dominates the membrane movement. So it is not the displacement that is proportional to the electrical signal, as you assumed, but the acceleration. The displacement is then proportional to the electrical signal integrated twice! This also means that, for the same electrical signal amplitude, the displacement amplitude falls as 1/f^2 with increasing frequency.
Effect B: So one might ask: if that is true, why don't all those speakers show a 1/f^2 downslope in their frequency response? The reason is the wave equation for sound waves. A sound source that is small compared to the wavelength acts as a "bad transmitting antenna", just as an electrically small antenna does. In both cases this gives a quadratic deterioration towards longer wavelengths, or equivalently a quadratic increase in antenna factor with increasing frequency. For a given displacement amplitude the radiated sound pressure therefore grows as f^2, which completely cancels the quadratic drop of effect A! This only changes for tweeters in their highest frequency range, where they are no longer small compared to the wavelength.
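To make the cancellation explicit, here is a short worked sketch. The symbols are my own assumptions, not something from the question: F is the drive force (taken as proportional to the electrical signal), m the moving mass, S the membrane area, x the displacement, ρ₀ the air density, r the listening distance and ω = 2πf. Effect B is written with the textbook far-field expression for a small source in free space; the numerical prefactor depends on how the driver is mounted (4π for free space, 2π for a large baffle), and only the ω dependence matters here.

```latex
\begin{aligned}
\text{Effect A (mass-controlled):}\quad
  & m\,\ddot{x} = F
  \;\Rightarrow\;
  |x| = \frac{|F|}{m\,\omega^{2}} \\[4pt]
\text{Effect B (small source):}\quad
  & |p(r)| = \frac{\rho_0\,S\,\omega^{2}\,|x|}{4\pi r} \\[4pt]
\text{Combined:}\quad
  & |p(r)| = \frac{\rho_0\,S\,|F|}{4\pi r\,m}
\end{aligned}
```

The ω² factors drop out, so under these assumptions the sound pressure follows the electrical signal directly, independent of frequency.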
So two special cases are left:
- Woofers somewhere below 100 Hz reach their resonance frequency, where the mass no longer dominates over the spring force. At the resonance the two cancel, so the displacement becomes very large; that's why damping is needed. Below the resonance the spring force dominates and the displacement becomes proportional to the electrical signal, so we lose effect A. That is bad, because effect B still exists (unless you are in a small room where the chamber resonances play a role, but let's assume free space here). So below the woofer resonance the frequency response usually drops quadratically towards lower frequency, which gives the low-frequency limitation.
- Tweeters at high frequency, where their membranes are no longer small compared to the wavelength. Effect B then does not apply any more, which is bad because effect A still applies. So from that point on the frequency response drops as the frequency increases further. This is somewhat mitigated because at high frequency the sound also gets more focused in the forward direction. But that is usually undesirable (dome tweeters, for instance, are used to actually prevent it), so this high-frequency limitation also remains.
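The woofer case can be checked with a few lines of Python. This is only a minimal sketch under my own assumptions: the driver is treated as a damped mass-spring oscillator, effect B is taken as pressure proportional to ω² times displacement, and the resonance frequency `f0 = 50` Hz and quality factor `q = 0.7` are illustrative values, not data from any real speaker. The function names are hypothetical. The tweeter limit is not modelled, since it needs the membrane size and directivity.

```python
import math


def relative_pressure(f, f0=50.0, q=0.7):
    """Sound pressure of a small driver relative to its mass-controlled plateau.

    Assumed model: m*x'' + R*x' + k*x = F, with f0 = sqrt(k/m)/(2*pi)
    and q = sqrt(k*m)/R. Both default values are illustrative only.
    """
    r = f / f0
    # Displacement per unit force, scaled by the spring constant k.
    x_norm = 1.0 / complex(1.0 - r * r, r / q)
    # Effect B: pressure ~ omega^2 * displacement. The r*r factor also
    # normalises the result so the high-frequency plateau equals 1.
    return abs(r * r * x_norm)


def to_db(ratio):
    """Convert an amplitude ratio to decibels."""
    return 20.0 * math.log10(ratio)


if __name__ == "__main__":
    for f in (12.5, 25.0, 50.0, 100.0, 1000.0, 10000.0):
        print(f"{f:8.1f} Hz  {to_db(relative_pressure(f)):7.2f} dB")
```

Well above `f0` the result settles at 0 dB (effects A and B cancel), at `f0` it equals `q`, and well below `f0` it falls by a factor of four per halving of frequency, which is the quadratic drop described for the woofer.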
So the main part of the audio spectrum, say the whole 100 Hz to 10 kHz range, is simply described by effects A and B!