These are the known definitions: We have a probability space $(\Omega, A, P)$.
Conditional probability is defined by $P(A|B) = \frac{P(A \cap B)}{P(B)}$ for $P(B) > 0$. This is a real number.
Then there is the conditional expectation $E[X|A_0]$, with $A_0$ being some sub-$\sigma$-algebra. This is a random variable. In the special case where $A_0$ is generated by a random variable $Y$ we also write $E[X|\sigma(Y)] = E[X|Y]$, and by the factorization theorem one can write $E[X|Y]$ as a measurable function of $Y$, which is then denoted $E[X|Y=y]$. For $X$ being an indicator function one sometimes also writes $E[1_A | Y=y] =: P(A | Y = y)$, which confuses me very much, because this is now the pointwise evaluation of a random variable that is only defined almost surely.
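To state my understanding of that factorization step precisely (this is just how I read the factorization/Doob–Dynkin lemma, not part of the proof in question): since $E[X|Y]$ is $\sigma(Y)$-measurable, there exists a measurable function $g$ with

$$E[X \mid Y] = g(Y) \quad P\text{-a.s.},$$

and one then defines $E[X \mid Y = y] := g(y)$. As I understand it, $g$ is only determined up to sets $N$ with $P(Y \in N) = 0$, so $E[X \mid Y = y]$ is only meaningful for $P^Y$-almost every $y$, not for a fixed individual $y$.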
I am particularly confused by the proof I state in this question.
Since the probability of $\{X_1 = i_1, \dots, X_{n-1} = i_{n-1}\}$ is $0$ (the $X_i$ are real-valued), the proof cannot be talking about elementary conditional probabilities. Thus I think the proof uses the sloppy notation for conditional expectation, $E[1_B(X_n) \mid X_1 = i_1, \dots, X_{n-1} = i_{n-1}] =: P(X_n \in B \mid X_1 = i_1, \dots, X_{n-1} = i_{n-1})$. With this reading, I don't understand why the individual steps in the proof are true. They are all obviously true when $P(\cdot\mid\cdot)$ is an elementary conditional probability, but to my eye they are not trivial when $P(\cdot\mid\cdot)$ stands for a conditional expectation.
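For reference, here is the property I would try to verify the steps against (assuming the standard setup, and writing $g(i_1, \dots, i_{n-1}) := P(X_n \in B \mid X_1 = i_1, \dots, X_{n-1} = i_{n-1})$): the function $g$ is characterized, up to null sets, by

$$\int_C g \, dP^{(X_1,\dots,X_{n-1})} = P\big(\{(X_1,\dots,X_{n-1}) \in C\} \cap \{X_n \in B\}\big) \quad \text{for all measurable } C \subseteq \mathbb{R}^{n-1}.$$

So it seems to me that any identity between such expressions can at best hold for $P^{(X_1,\dots,X_{n-1})}$-almost every $(i_1,\dots,i_{n-1})$, not pointwise, and that is exactly where I lose track of why the steps in the proof are justified.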