Denote the so-called canonical ensemble by $\mu$:
$$ \mu(dq,dp)=Z^{-1}_\beta e^{-\beta H(q,p)}dqdp. $$
(with, say, $q,p \in \mathbb{R}^{3N}$ for some $N\in \mathbb{N}$, and $Z_\beta$ the normalisation constant).
It is not hard to prove that $\mu$ maximises the differential entropy:
$$S(\rho):=-\int \rho \log \rho $$
over all distributions with fixed average energy. That is, for some constant $E$ (which in turn fixes $\beta$):
$$\mu=\text{argmax} \Big\{ S(\rho),~\rho \geq 0,~ \int \rho =1,~ \int H\rho = E \Big\}.$$
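For reference, here is a sketch of one standard argument for this fact. It assumes that $\rho$ is a probability density with $\int H\rho = E$, that $\beta$ is chosen so that $\mu$ itself has average energy $E$, and that all integrals below are finite. Writing $\mu$ also for the density $Z_\beta^{-1}e^{-\beta H}$, so that $\log \mu = -\beta H - \log Z_\beta$, we get
$$ -\int \rho \log \rho = -\int \rho \log \frac{\rho}{\mu} - \int \rho \log \mu = -D(\rho \,\|\, \mu) + \beta E + \log Z_\beta, $$
where $D(\rho \,\|\, \mu) := \int \rho \log (\rho/\mu) \geq 0$ is the relative entropy, which vanishes only when $\rho = \mu$. Taking $\rho = \mu$ shows that $\beta E + \log Z_\beta = -\int \mu \log \mu$, hence
$$ -\int \rho \log \rho = -\int \mu \log \mu - D(\rho \,\|\, \mu) \leq -\int \mu \log \mu. $$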
My question is: why does this fact motivate modelling the microscopic distribution of a physical system by $\mu$ when the system has fluctuating energy (with fixed average) and fixed temperature?