
I am self-learning introductory stochastic calculus from the text *A First Course in Stochastic Calculus* by Louis-Pierre Arguin. I'm struggling to understand a particular step in the proof below, and I would like to ask for some help with it. I've tried to find the same proof online, but I had a hard time following the versions I found.

Theorem. (Existence and uniqueness of the conditional expectation) Let $X$ be a random variable on $(\Omega,\mathcal{F},\mathbb{P})$. Let $Y$ be a random variable in $L^{2}(\Omega,\mathcal{F},\mathbb{P})$. Then the conditional expectation $\mathbb{E}[Y|X]$ is the random variable $Y^{\star}$ given in equation (A). Namely, it is the random variable in $L^{2}(\Omega,\sigma(X),\mathbb{P})$ that is closest to $Y$ in the $L^{2}$-distance. In particular, we have:

  1. It is the orthogonal projection of $Y$ onto $L^{2}(\Omega,\sigma(X),\mathbb{P})$; that is, $Y-Y^{\star}$ is orthogonal to every random variable in the subspace $L^{2}(\Omega,\sigma(X),\mathbb{P})$.

  2. It is unique.

Remark. This result reinforces the meaning of the conditional expectation $\mathbb{E}[Y|X]$ as the best estimate of $Y$ given the information in $X$: it is the closest random variable to $Y$, in the $L^{2}$ sense, among all functions of $X$.
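As a concrete illustration (my addition, not from Arguin's text): suppose $Y = X + \varepsilon$, where $X \in L^{2}$ and $\varepsilon$ is independent of $X$ with $\mathbb{E}[\varepsilon] = 0$ and $\mathbb{E}[\varepsilon^{2}] < \infty$. For any $g(X) \in L^{2}(X)$, independence makes the cross term $\mathbb{E}[(X - g(X))\varepsilon]$ vanish, so

$$\mathbb{E}[(Y - g(X))^{2}] = \mathbb{E}[(X - g(X))^{2}] + \mathbb{E}[\varepsilon^{2}] \geq \mathbb{E}[\varepsilon^{2}],$$

with equality if and only if $g(X) = X$ almost surely. Hence $\mathbb{E}[Y|X] = X$: the best estimate simply discards the noise.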

[Figure from the text: $Y^{\star}$ depicted as the orthogonal projection of $Y$ onto the subspace $L^{2}(X)$.]

So, $Y^{\star}$ is such that:

$$\inf_{Z\in L^2(X)}\mathbb{E}[(Y-Z)^2] = \mathbb{E}[(Y-Y^{\star})^2] \tag{1}$$
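(A remark of mine on why the infimum in (1) is attained: this is the Hilbert projection theorem. $L^{2}(\Omega,\mathcal{F},\mathbb{P})$ is a Hilbert space under the inner product $\langle U, V \rangle = \mathbb{E}[UV]$, and $L^{2}(X)$ is a closed subspace of it; for any closed subspace $K$ of a Hilbert space $H$, every $Y \in H$ has a unique closest point $Y^{\star} \in K$:

$$\|Y - Y^{\star}\| = \inf_{Z \in K} \|Y - Z\|.)$$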

Proof.

We write for short $L^{2}(X)$ for the subspace $L^{2}(\Omega,\sigma(X),\mathbb{P})$. Let $Y^{\star}$ be as in equation (A). We show successively that (1) $Y-Y^{\star}$ is orthogonal to every element of $L^{2}(X)$, so it is the orthogonal projection; (2) $Y^{\star}$ has the properties of the conditional expectation in the definition; (3) $Y^{\star}$ is unique.

(1) Let $W=g(X)$ be a random variable in $L^{2}(X)$. We show that $W$ is orthogonal to $Y-Y^{\star}$; that is, $\mathbb{E}[(Y-Y^{\star})W]=0$. This should be intuitively clear from the figure. On the one hand, expanding the square gives:

\begin{align*} \mathbb{E}[(W-(Y-Y^{\star}))^{2}] & =\mathbb{E}[W^{2}-2W(Y-Y^{\star})+(Y-Y^{\star})^{2}]\\ & =\mathbb{E}[W^{2}]-2\mathbb{E}[W(Y-Y^{\star})]+\mathbb{E}[(Y-Y^{\star})^{2}] \tag{2} \end{align*}

On the other hand, $Y^{\star}+W$ is in $L^{2}(X)$ (it is a linear combination of elements of $L^{2}(X)$), so from equation (1) we must have:

$$\mathbb{E}[(W-(Y-Y^{\star}))^2] \geq \mathbb{E}[(Y-Y^{\star})^2]\tag{3}$$

I simply don't follow how this last inequality (3) is arrived at.

Putting the last two equations together, we get that for any $W \in L^2(X)$,

$$\mathbb{E}[W^2]-2\mathbb{E}[W(Y-Y^{\star})]\geq 0 \tag{4}$$
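(To spell out how (4) is obtained from (2) and (3): substituting the expansion (2) into the left-hand side of (3) gives

$$\mathbb{E}[W^{2}]-2\mathbb{E}[W(Y-Y^{\star})]+\mathbb{E}[(Y-Y^{\star})^{2}] \geq \mathbb{E}[(Y-Y^{\star})^{2}],$$

and cancelling $\mathbb{E}[(Y-Y^{\star})^{2}]$ from both sides leaves (4).)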

In particular, (4) also holds with $W$ replaced by $aW$ for any $a \in \mathbb{R}$, in which case we get:

\begin{align*} a^2 \mathbb{E}[W^2] - 2a\mathbb{E}[W(Y-Y^{\star})] &\geq 0\\ a \left\{ a\mathbb{E}[W^2] - 2\mathbb{E}[W(Y-Y^{\star})] \right\} &\geq 0 \tag{5} \end{align*}

If $a > 0$, then dividing (5) by $a$ gives:

$$a\mathbb{E}[W^2] - 2\mathbb{E}[W(Y-Y^{\star})] \geq 0 \tag{6a}$$

whereas if $a < 0$, dividing by $a$ flips the inequality:

$$a\mathbb{E}[W^2] - 2\mathbb{E}[W(Y-Y^{\star})] \leq 0 \tag{6b}$$

Re-arranging (6a) yields:

$$\mathbb{E}[W(Y-Y^{\star})] \leq a\mathbb{E}[W^2]/2 \tag{7a}$$

and rearranging (6b) yields:

$$\mathbb{E}[W(Y-Y^{\star})] \geq a\mathbb{E}[W^2]/2 \tag{7b}$$

Since (7a) holds for all $a > 0$, letting $a \to 0^{+}$ shows that $\mathbb{E}[W(Y-Y^{\star})] \leq 0$. Since (7b) holds for all $a < 0$, letting $a \to 0^{-}$ shows that $\mathbb{E}[W(Y-Y^{\star})] \geq 0$. Consequently,

$$\mathbb{E}[W(Y-Y^{\star})] = 0$$
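As a quick numerical sanity check of this orthogonality property, here is a minimal sketch (my addition; it assumes NumPy, and the discrete $X$ and group-mean construction of $Y^{\star}$ are illustrative choices, not the book's):

```python
import numpy as np

rng = np.random.default_rng(seed=42)
n = 200_000

# X takes finitely many values, so E[Y | X] reduces to a group-wise mean of Y.
x = rng.integers(0, 5, size=n).astype(float)
# Y is a function of X plus noise that is independent of X.
y = x**2 + rng.normal(0.0, 1.0, size=n)

# Empirical version of Y* = E[Y | X]: average Y over each level of X.
y_star = np.empty_like(y)
for k in np.unique(x):
    mask = (x == k)
    y_star[mask] = y[mask].mean()

# Orthogonality check: E[(Y - Y*) g(X)] should vanish for any g(X) in L^2(X).
for g in (np.sin, np.exp, lambda t: t**3):
    w = g(x)
    print(np.mean((y - y_star) * w))  # each value is ~0 up to rounding error
```

Each printed value is essentially zero: within each level of $X$, the residual $Y - Y^{\star}$ averages to zero by construction, and $g(X)$ is constant there, so the cross moment cancels group by group.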

  • Corrected the typo in parentheses. – Quasar, Jun 11, 2023 at 18:51
  • Do you mean that the reason why there is orthogonality is not clear to you? – Math-fun, Jun 11, 2023 at 19:00
  • @Math-fun, I don't follow this inequality: $\mathbb{E}[(W-(Y-Y^{\star}))^2] \geq \mathbb{E}[(Y-Y^{\star})^2]$. Why is the distance between $W$ and $Y - Y^{\star}$ at least as large as the length of the vector $Y-Y^{\star}$ (in the $L^2$ sense)? How does one arrive at it? – Quasar, Jun 11, 2023 at 19:06
  • $\mathbb{E}[(a-b)^2]=\mathbb{E}[a^2]+\mathbb{E}[b^2]-2\mathbb{E}[ab]$, and with $\mathbb{E}[ab]=0$ you have $\mathbb{E}[(a-b)^2]=\mathbb{E}[a^2]+\mathbb{E}[b^2]\geq \mathbb{E}[b^2]$. – Math-fun, Jun 11, 2023 at 20:25
  • @Math-fun, we don't know yet whether $\mathbb{E}[ab]=0$; that's the first result we are trying to prove here. – Quasar, Jun 12, 2023 at 4:08

1 Answer


On closer inspection, inequality (3) follows directly from equation (1).

Since $W + Y^{\star}$ is itself a vector in $L^2(X)$, the infimum in equation (1) is a lower bound for the corresponding squared distance, so we must have:

\begin{align*}\mathbb{E}[(W - (Y - Y^{\star}))^2] &= \mathbb{E}[(Y - (W + Y^{\star}))^2]\\ &\geq \inf_{Z \in L^2(X)} \mathbb{E}[(Y-Z)^2]\\ &= \mathbb{E}[(Y-Y^{\star})^2] \end{align*}

The first equality holds because $W - (Y - Y^{\star}) = -\big(Y - (W + Y^{\star})\big)$ and the sign disappears upon squaring; the final equality is just equation (1).
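As a follow-up remark (my addition, not part of the original answer): once the orthogonality $\mathbb{E}[(Y-Y^{\star})W] = 0$ is in hand, the uniqueness claimed in the theorem also drops out. For any $Z \in L^2(X)$, since $Y - Y^{\star}$ is orthogonal to $Y^{\star} - Z \in L^2(X)$, the Pythagorean identity gives

$$\mathbb{E}[(Y-Z)^2] = \mathbb{E}[(Y-Y^{\star})^2] + \mathbb{E}[(Y^{\star}-Z)^2],$$

so any other minimizer $Z$ must satisfy $\mathbb{E}[(Y^{\star}-Z)^2] = 0$, i.e. $Z = Y^{\star}$ almost surely.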

