Background
A linear time-invariant (LTI) system (black box, BB) is one described by the equations \begin{align} \dot{\xi}(t) & = A\xi(t) + B\omega(t), \; \xi(0) = 0 \label{eq-abc-1}\\ \lambda(t) & = C\xi(t) \end{align} where $A \in \mathbb{R}^{n\times n}$ ($n$ is the dimension of the system), $B \in \mathbb{R}^{n\times m}$ ($m$ is the number of inputs of the BB model), and $C \in \mathbb{R}^{p\times n}$ ($p$ is the number of outputs). $\xi(t) \in \mathbb{R}^n$ is the state of the system, $\omega(t) \in \mathbb{R}^m$ is the input, and $\lambda(t) \in \mathbb{R}^p$ is the output.
Out of this system of ODEs (called the state-space representation), the concepts of impulse, impulse response, convolution, transfer function, etc. arise.
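To be explicit about what I mean, these objects follow from the system above by the standard variation-of-constants formula. The symbols $h$ (impulse response) and $H$ (transfer function) are my own notation, not part of the system as stated, and the last identity assumes $\operatorname{Re}(s)$ is larger than the real part of every eigenvalue of $A$:
\begin{align}
\xi(t) & = \int_0^t e^{A(t-\tau)} B\,\omega(\tau)\,d\tau, \\
\lambda(t) & = \int_0^t h(t-\tau)\,\omega(\tau)\,d\tau = (h * \omega)(t), \qquad h(t) = C e^{At} B, \\
H(s) & = \int_0^\infty h(t)\,e^{-st}\,dt = C(sI - A)^{-1}B.
\end{align}
So once a state-space representation exists, the input-output map is a convolution with $h$.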
In acoustics, the concept of impulse response (IR) is often used to model the reverberation of a room: one generates a loud, short-duration sound in the room to acquire the room's IR, and then one can convolve the IR with any sound signal to simulate the response of the room quite successfully.
In this particular case, we are thinking of the system whose input is the sound wave emitted from a fixed point $P_0$, and whose output is the resulting sound at a fixed point $P_1$ in the same room.
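As a rough illustration of the procedure I have in mind, here is a minimal Python sketch using only NumPy. The sample rate, the synthetic "IR" (exponentially decaying noise), and the test tone are all made-up stand-ins of mine, not measured data; a real room IR would come from a recording at $P_1$.

```python
import numpy as np

fs = 8000  # sample rate in Hz (assumed value, for illustration only)
rng = np.random.default_rng(0)

# Hypothetical stand-in for a measured room IR: exponentially decaying noise.
t = np.arange(int(0.5 * fs)) / fs
ir = rng.standard_normal(t.size) * np.exp(-t / 0.1)

# "Dry" source signal emitted at P_0: a short 440 Hz tone burst.
n_dry = int(0.25 * fs)
dry = np.sin(2 * np.pi * 440 * np.arange(n_dry) / fs)

# Simulated signal at P_1: discrete convolution of the input with the IR.
wet = np.convolve(dry, ir)

# Feeding a unit impulse through the same operation returns the IR itself.
impulse = np.zeros(16)
impulse[0] = 1.0
recovered = np.convolve(impulse, ir)[: ir.size]
```

The last three lines are the discrete analogue of "measure the response to a short loud sound": the response to a unit impulse is the IR.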
My question
In physics (acoustics), how does one justify that there is a ``state space'' representation of this system? How does one justify applying all the theory of LTI systems to sound propagation?
I'm asking this because that is what is assumed when one measures the IR of a room, and in many other acoustical scenarios.
I've seen justifications like the ray-tracing approach, where one just adds up many copies of the same input signal, each delayed according to a reflection in the room, and a convolution magically appears. My concern is whether there is a justification based on the differential equations of acoustics that leads to a state-space representation.
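To make the ray-tracing argument concrete, here is a small Python sketch of it as I understand it. The list of paths (delay in samples, gain) is hypothetical and chosen only for illustration; it does not come from any real room geometry.

```python
import numpy as np

# Hypothetical reflection paths: (delay in samples, gain). Made-up values.
paths = [(0, 1.0), (37, 0.6), (81, -0.35), (150, 0.2)]
max_delay = max(delay for delay, _ in paths)

rng = np.random.default_rng(1)
x = rng.standard_normal(200)  # arbitrary input signal

# Ray-tracing view: add one delayed, attenuated copy of x per path.
y_sum = np.zeros(x.size + max_delay)
for delay, gain in paths:
    y_sum[delay:delay + x.size] += gain * x

# LTI view: place one scaled impulse per path into an IR, then convolve.
h = np.zeros(max_delay + 1)
for delay, gain in paths:
    h[delay] += gain
y_conv = np.convolve(x, h)

# Both constructions give the same output signal.
same = np.allclose(y_sum, y_conv)
```

This shows why the sum of lagged copies is a convolution with a sparse IR, but it starts from delays and gains rather than from a differential equation, which is the part I am missing.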
If my question needs clarification, just let me know.