I am trying to show that the matrix $(I - \gamma P^\pi)$ is always invertible, where $P^\pi$ is a stochastic matrix (i.e. $P^\pi_{ij} \geq 0$ and each row sums to 1) and $\gamma \in [0,1)$. I found two sources that prove this in different ways, but I can't really understand either.
- From (https://ai.stanford.edu/~gwthomas/notes/mdps.pdf, page 10)
$$\begin{align*}||(I - \gamma P^\pi)x||_\infty &= ||x - \gamma P^\pi x||_\infty \\ &\geq ||x||_\infty - \gamma||P^\pi x||_\infty \\ &\geq ||x||_\infty - \gamma||x||_\infty \\ &> 0\end{align*} $$
I don't understand where the second inequality comes from. I guess it is true if $||Ax|| \leq ||A||\cdot||x||$ holds in general (even for the infinity norm), since $||P^\pi||_\infty = 1$.
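Spelling out my guess directly for a row-stochastic matrix (this is my own attempt, not taken from the notes, and it only uses $P^\pi_{ij} \geq 0$ and $\sum_j P^\pi_{ij} = 1$): for each row $i$,
$$\begin{align*}|(P^\pi x)_i| &= \Big|\sum_j P^\pi_{ij} x_j\Big| \\ &\leq \sum_j P^\pi_{ij} |x_j| \\ &\leq ||x||_\infty \sum_j P^\pi_{ij} \\ &= ||x||_\infty\end{align*} $$
Taking the maximum over $i$ would then give $||P^\pi x||_\infty \leq ||x||_\infty$, which seems to be exactly what the second inequality needs.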
- From (http://researchers.lille.inria.fr/~lazaric/Webpage/MVA-RL_Course14_files/notes-lecture-02.pdf, page 17)
I am able to show that all eigenvalues of $P^\pi$ are $\leq 1$ and that all eigenvalues of $(I - \gamma P^\pi)$ are $\geq 1 - \gamma$, so the matrix is PD and thus invertible. However, what happens if $P^\pi$ has some nonreal eigenvalues? I don't think it makes sense to say that a complex eigenvalue is $\leq 1$. Does this proof handle that case too?
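To make the worry about nonreal eigenvalues concrete, here is a small NumPy check on a toy example of my own (a 3-state cyclic permutation matrix with $\gamma = 0.9$, neither of which comes from the two sources). It is only a numerical sanity check on one matrix, not a proof, and it compares moduli rather than the eigenvalues themselves.

```python
import numpy as np

# Assumed toy values, chosen only for illustration.
gamma = 0.9

# Cyclic permutation on 3 states: entries are >= 0 and every row sums to 1,
# so this is a stochastic matrix. Its eigenvalues are the cube roots of unity.
P = np.array([
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
])

eig_P = np.linalg.eigvals(P)
eig_M = np.linalg.eigvals(np.eye(3) - gamma * P)

# Does P have eigenvalues with a nonzero imaginary part?
has_complex = bool(np.any(np.abs(eig_P.imag) > 1e-9))

# Largest modulus among eigenvalues of P, smallest among those of I - gamma * P.
max_mod_P = float(np.max(np.abs(eig_P)))
min_mod_M = float(np.min(np.abs(eig_M)))

print("eigenvalues of P:", eig_P)
print("P has nonreal eigenvalues:", has_complex)
print("max |eigenvalue of P|:", max_mod_P)
print("min |eigenvalue of I - gamma P|:", min_mod_M)
print("1 - gamma:", 1 - gamma)
```

In this example $P^\pi$ does have nonreal eigenvalues, so "$\leq 1$" can only be read as a statement about their modulus, and no eigenvalue of $(I - \gamma P^\pi)$ has modulus below $1 - \gamma$.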