Suppose I want to relate the distribution of $Y$ to $p$ covariates, which I denote by $x_1, x_2, \dots, x_p$. I denote the linear predictor by $\eta$, where $\eta = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p$.
The conditional mean $\mu = E\left[Y \mid X\right]$ is connected to the linear predictor through the response function $h$, so that $\mu = h(\eta)$. Finally, I call $g$ the inverse of $h$, that is, $g = h^{-1}$. It follows that $\eta = g(\mu)$, and $g$ is called the link function.
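To make the notation concrete, here is a minimal sketch in Python. It assumes a logit link purely for illustration (nothing above fixes a particular $g$), and the coefficient and covariate values are made up.

```python
import math

def linear_predictor(beta, x):
    """eta = beta_0 + beta_1 * x_1 + ... + beta_p * x_p."""
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))

def h(eta):
    """Response function (assumed here: inverse logit), mu = h(eta)."""
    return 1.0 / (1.0 + math.exp(-eta))

def g(mu):
    """Link function (assumed here: logit), the inverse of h, eta = g(mu)."""
    return math.log(mu / (1.0 - mu))

beta = [-1.0, 0.5, 2.0]  # hypothetical beta_0, beta_1, beta_2
x = [2.0, 0.25]          # hypothetical x_1, x_2

eta = linear_predictor(beta, x)  # -1 + 0.5 * 2 + 2 * 0.25
mu = h(eta)                      # conditional mean E[Y | X]
eta_back = g(mu)                 # applying the link recovers eta

print(eta, mu, eta_back)
```

Here `eta_back` agrees with `eta` up to floating-point error, which is just the statement $g = h^{-1}$.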
Why does the link function $g$ make the model linear? The question may seem trivial, but is the answer simply that $\eta = g(\mu)$ is a definition, in the sense that the linear predictor is linear because $g$ is a linear function?