Phase transitions in statistical mechanics are usually taught by working through a bunch of examples. I decided to try to think about them from a more "fundamental" point of view, but I've run into a weird little snag: I've come up with a system that seems to satisfy the most common definition of a (first-order) phase transition, yet it's too trivially simple to really be interesting. Because of this, I'm wondering whether there's a way to define phase transitions so that the definition includes interesting examples such as the two-dimensional Ising model, yet excludes the one I present below.
My reason for asking is not that I want to be finicky about definitions for their own sake, but rather that I want to get a handle on what special feature(s) less trivial models have that mine doesn't, which cause them to exhibit interesting effects such as scale-free behaviour around the transition point.
My trivial model is as follows. Consider a two-state (classical) system, which can either be in state 0 with energy $E_0 = 0$, or state 1 with energy $E_1 = \lambda \varepsilon $. Here $\varepsilon$ is a fixed value (with dimensions of energy) and $\lambda$ is a dimensionless parameter that I will vary later, representing the size of the system.
We assume the system is in a Boltzmann distribution with inverse temperature $\beta$. The probabilities $p_i$ of the system being in state $i$ are then as follows: $$ p_0 = 1/Z \quad\text{and}\quad p_1 = e^{-\beta \varepsilon \lambda}/Z, $$ with the normalisation factor ("partition function") being given by $Z=1+e^{-\beta \varepsilon\lambda}$.
We now consider changing the "scale" of the system, $\lambda$. The expected energy density in the system is given by $$u(\beta) = \frac{\langle E \rangle}{\lambda} = \frac{\varepsilon e^{-\beta\varepsilon\lambda}}{1+e^{-\beta\varepsilon\lambda}}.$$ It is not hard to see that in the limit of infinite $\lambda$ this becomes a step function, with $u=0$ if $\beta>0$ and $u=\varepsilon$ if $\beta<0$.
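The sharpening of $u(\beta)$ into a step can be checked numerically. The following is a minimal sketch, assuming units with $\varepsilon = 1$ by default; the function names `logistic` and `energy_density` are made up for this illustration. It evaluates the formula above in a way that avoids overflow for large $|\beta\varepsilon\lambda|$.

```python
import math

def logistic(y):
    """Numerically stable 1 / (1 + e^{-y})."""
    if y >= 0:
        return 1.0 / (1.0 + math.exp(-y))
    t = math.exp(y)
    return t / (1.0 + t)

def energy_density(beta, lam, eps=1.0):
    """u = eps * e^{-x} / (1 + e^{-x}) with x = beta * eps * lam."""
    return eps * logistic(-beta * eps * lam)

# As lam grows, u approaches eps for beta < 0 and 0 for beta > 0,
# while staying at eps / 2 exactly at beta = 0.
for lam in (1, 10, 100, 1000):
    print(lam, [energy_density(b, lam) for b in (-0.1, 0.0, 0.1)])
```

At $\beta = 0$ the two states are equally likely for every $\lambda$, so $u = \varepsilon/2$ there no matter how sharp the step becomes.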
Similarly, the density of the log partition function (or dimensionless "free energy"), given by $$ \frac{\log Z}{\lambda} = \frac{\log(1+e^{-\beta \varepsilon \lambda})}{\lambda}, $$ becomes piecewise linear in the limit of large $\lambda$, exhibiting a discontinuity in its first derivative at $\beta=0$. In the limit, $\frac{\log Z}{\lambda} = -\beta \varepsilon$ if $\beta<0$, and $0$ otherwise. One can also note that at the transition point, $\beta=0$, the variance of $E$ equals $\lambda^2\varepsilon^2/4$ and so diverges as $\lambda\to\infty$.
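A rough numerical check of these two statements, again as a sketch with $\varepsilon = 1$ by default and with illustrative function names (`log_z_density`, `limit_density`, `energy_variance`) that are not from any library. It uses the identity $\log(1+e^{x}) = \max(x,0) + \log(1+e^{-|x|})$ to avoid overflow.

```python
import math

def log_z_density(beta, lam, eps=1.0):
    """(1/lam) * log(1 + e^{-beta*eps*lam}), computed without overflow."""
    x = -beta * eps * lam
    return (max(x, 0.0) + math.log1p(math.exp(-abs(x)))) / lam

def limit_density(beta, eps=1.0):
    """Large-lam limit: -beta*eps for beta < 0, and 0 otherwise."""
    return max(-beta * eps, 0.0)

def energy_variance(beta, lam, eps=1.0):
    """Var(E) = (lam*eps)^2 * p1 * (1 - p1) for the two-state system."""
    x = -beta * eps * lam
    # p1 = e^{x} / (1 + e^{x}), written to avoid overflow for either sign of x
    if x >= 0:
        p1 = 1.0 / (1.0 + math.exp(-x))
    else:
        p1 = math.exp(x) / (1.0 + math.exp(x))
    return (lam * eps) ** 2 * p1 * (1.0 - p1)

for lam in (1, 10, 100, 1000):
    gap = max(abs(log_z_density(b / 10, lam) - limit_density(b / 10))
              for b in range(-20, 21))
    print(lam, gap, energy_variance(0.0, lam))
```

The gap between the finite-$\lambda$ curve and the piecewise-linear limit is largest at $\beta = 0$, where it equals $\log 2/\lambda$, while the variance at $\beta = 0$ grows as $\lambda^2\varepsilon^2/4$.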
So this system seems to exhibit the same discontinuities as nontrivial models of phase transitions --- yet it's just a simple two-state system. Intuitively this phenomenon is too simple to be called a phase change, so I would like to understand what the defining difference is between this type of behaviour and the non-trivial phase changes that can take place in Ising type models and in physical systems.
Alternatively, I suppose it's possible that some of the "interesting" features of phase transitions are present in my example, but it just takes some clever interpretation to get at them. If this is the case I would greatly appreciate an explanation.
Addendum. The model as described above has the advantage of being the simplest I could come up with, but it has the disadvantage that the transition point is at $\beta=0$, which corresponds to infinite temperature, since $\beta=1/k_BT$. This makes the model seem somewhat pathological. However, a small change to the model can make the transition take place at a finite, positive temperature. Making this change turns out to be quite instructive.
To achieve this we give the higher energy level some degeneracy. That is, there is still only one state with $E_0=0$, but there are now $g$ states with $E_i=\lambda\varepsilon$. (If the single state with $E=0$ is aesthetically displeasing, one can also give it a degeneracy; I have chosen not to do this for simplicity.) In this case, as before, $p_0=1/Z$ and $p_i = e^{-\beta\varepsilon\lambda}/Z$ for $i>0$, but now $Z=1+ge^{-\beta\varepsilon\lambda}$. In order to get a transition at a non-zero temperature we must make the additional assumption that the degeneracy scales exponentially with $\lambda$, i.e. $g=e^{a\lambda}$ for some dimensionless constant parameter $a>0$, so that $Z = 1+e^{(a-\beta\varepsilon)\lambda}$. In this case (as gatsu deduced in his answer), $$ u(\beta) = \frac{\varepsilon e^{(a-\beta\varepsilon)\lambda}}{1+e^{(a-\beta\varepsilon)\lambda}}. $$ In the limit of infinite $\lambda$ this has a transition point at $\beta = a/\varepsilon$, with $u$ being zero if $\beta$ is greater than this value, and $\varepsilon$ if it is below it. The "free energy" density $\log Z(\beta)/\lambda$ changes in a similar way, becoming piecewise linear with a discontinuity in its first derivative at $\beta = a/\varepsilon$.
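The shifted transition can be illustrated the same way. This is a sketch under the stated assumption $g = e^{a\lambda}$, with $\varepsilon = 1$ and $a = 0.5$ chosen arbitrarily as defaults; the function names are hypothetical. Working with the exponent $(a-\beta\varepsilon)\lambda$ directly means the (astronomically large) degeneracy $g$ never has to be formed as a number.

```python
import math

def critical_beta(a, eps=1.0):
    """Transition point beta_c = a / eps in the large-lam limit."""
    return a / eps

def u_degenerate(beta, lam, a=0.5, eps=1.0):
    """u = eps * e^{y} / (1 + e^{y}) with y = (a - beta*eps) * lam."""
    y = (a - beta * eps) * lam
    if y >= 0:
        return eps / (1.0 + math.exp(-y))
    t = math.exp(y)
    return eps * t / (1.0 + t)

def log_z_density_degenerate(beta, lam, a=0.5, eps=1.0):
    """(1/lam) * log(1 + e^{y}); tends to max(a - beta*eps, 0) as lam grows."""
    y = (a - beta * eps) * lam
    return (max(y, 0.0) + math.log1p(math.exp(-abs(y)))) / lam

beta_c = critical_beta(0.5)
for lam in (10, 100, 1000):
    print(lam, [u_degenerate(beta_c + d, lam) for d in (-0.1, 0.0, 0.1)])
```

With these defaults the step sits at $\beta = 0.5$ rather than at $\beta = 0$: below it the exponentially many excited states win, above it the single ground state does.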
As well as being less pathologically behaved, this model also feels a bit less trivial, since it has many microstates rather than just two and, as gatsu points out, the transition now arises from a competition between the low energy of state $0$ and the entropy of the degenerate excited states. Nevertheless, as far as I can see, this model lacks features such as power-law behaviour and symmetry breaking that are associated with phase transitions in other models. I'm interested in the essential difference between my model and those, which gives rise to such non-trivial behaviour around the critical point.