
I'm currently confused about the conditional expectation. Let's recall the definition first:

Let $X$ be an integrable function (or a random variable) defined on a probability space $(\Omega,\mathfrak{M},\mu)$ and let $\mathfrak{A}$ be a $\sigma$-subalgebra of $\mathfrak{M}$. The conditional expectation of $X$ given $\mathfrak{A}$ is then defined to be an $\mathfrak{A}$-measurable function $\mathbb{E}(X|\mathfrak{A})$ which satisfies the identity $$\int_AXd\mu=\int_A\mathbb{E}(X|\mathfrak{A})d\mu $$ for any $A\in\mathfrak{A}$. Existence and uniqueness (up to equality almost everywhere) follow from the Radon–Nikodym theorem. Two special cases are as follows:

(1) If $X=\chi_{E}$ for some $E\in\mathfrak{M}$, then $\mathbb{E}(\chi_{E}|\mathfrak{A})$ is called the conditional probability of $E$ (relative to $\mathfrak{A}$).

(2) If $\mathfrak{A}$ is the $\sigma$-algebra generated by a function $Y$, we write $\mathbb{E}(X|\mathfrak{A})=\mathbb{E}(X|Y)$.

Q1. Why is the conditional "probability" a function, not a number?

Q2. What is the intuitive meaning of $\mathbb{E}(X|Y)$?

As for Q2, I think that $\mathbb{E}(X|Y)$ should mean an expectation of $X$ provided that $Y$ happened (as in elementary probability theory); for example, the symbol $\mathbb{E}(X|Y=y)$ should indicate the expectation of $X$ when $Y=y$. However, I find it hard to connect this idea with the above definition. I'm not even 100% sure about this; in any case, $\mathbb{E}(X|Y)$ is a function, not an expectation (which is a number).

I'm familiar with measure theory, but not with probability theory. So any help would be appreciated! Thank you.

  • You seem to switch between $Y$ being an event and being a random variable. If $Y$ is a random variable and $\mathbb{E}(X\mid Y=y)$ varies as $y$ varies, then it depends on $y$ and so can be described as a function of $y$ with $f(y)=\mathbb{E}(X\mid Y=y)$. So you can say $\mathbb{E}(X\mid Y)=f(Y)$ is a function of the random variable $Y$ and is itself a random variable rather than a number.
    – Henry
    Commented Mar 13 at 15:31

1 Answer


First, since conditional probabilities are conditional expectations of indicator functions, you can forget about that concept as a separate thing.

Moving on: for intuition, it helps to think about finite probability spaces with no nonempty null sets. This makes some statements that are only "morally true" in the general case literally true.

Consider for example $\Omega=\{ 1,2,3 \}$ with the underlying $\sigma$-algebra being $\mathcal{F}=P(\Omega)$ and then think about $\mathcal{G}=\sigma(\{ \{ 1 \} \})=\{ \Omega,\emptyset,\{ 1 \},\{ 2,3 \} \}$. Say $P(\{ i \})=p_i>0$ for all $i$.

The conditional expectation of a random variable $X$ with respect to this $\mathcal{G}$ intuitively means you have an instrument that can distinguish whether $\omega=1$ or not, but it can't tell you whether $\omega=2$ or $\omega=3$. You are then asked to use your instrument to give a "best guess" of $X$ for each $\omega$. This will depend on $\omega$: if $\omega=1$ then it will be $X(1)$, but otherwise it should be somewhere between $X(2)$ and $X(3)$. The measurability requirement forces $E[X \mid \mathcal{G}](2)=E[X \mid \mathcal{G}](3)$. The defining identity applied to $A=\{2,3\}$ then tells you $p_2 X(2) + p_3 X(3) = (p_2 + p_3) E[X \mid \mathcal{G}](2)=(p_2 + p_3)E[X \mid \mathcal{G}](3)$, so both values equal $\frac{p_2 X(2) + p_3 X(3)}{p_2 + p_3}$.
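This computation can be checked numerically. Here is a minimal sketch in Python; the specific weights $p_i$ and the values of $X$ are illustrative choices, not anything fixed by the problem:

```python
from fractions import Fraction as F

# Omega = {1, 2, 3}, G = sigma({{1}}) as above; the weights p_i > 0 are illustrative
p = {1: F(1, 2), 2: F(1, 3), 3: F(1, 6)}
X = {1: 5, 2: -2, 3: 4}

# Measurability forces E[X|G] to be constant on {2, 3}; the defining identity
# with A = {2, 3} then pins that constant down as a weighted average:
guess = (p[2] * X[2] + p[3] * X[3]) / (p[2] + p[3])
E_X_G = {1: X[1], 2: guess, 3: guess}

# Check the defining identity on the generators of G
assert p[1] * X[1] == p[1] * E_X_G[1]                                   # A = {1}
assert p[2] * X[2] + p[3] * X[3] == p[2] * E_X_G[2] + p[3] * E_X_G[3]   # A = {2, 3}
```

Using exact rationals (`fractions.Fraction`) rather than floats lets the defining identity be checked with `==` instead of an approximate comparison.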

As for $E[X \mid Y]$, again in the setting of a finite space with no nonempty null sets, $\sigma(Y)$ is the smallest $\sigma$-algebra containing all the preimages $Y^{-1}(\{y\})$ for $y$ in the range of $Y$. Therefore, in this setting, $E[X \mid Y](\omega)$ is the same as $E[X \mid Y=Y(\omega)]$ in the sense you have seen previously. Conditional expectation is "the right generalization" of this concept to the situation where the event $\{Y=c\}$ can be nonempty but null.
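On such a finite space, this relation can be sketched in code: group the points $\omega$ by the value $Y(\omega)$ and average $X$ over each level set. The space, the uniform weights, and the particular $X$ and $Y$ below are all illustrative choices:

```python
from fractions import Fraction as F

# Uniform weights on Omega = {0, ..., 5} (illustrative; no nonempty null sets)
omegas = list(range(6))
p = {w: F(1, 6) for w in omegas}
X = {w: w * w for w in omegas}   # some random variable X
Y = {w: w % 2 for w in omegas}   # Y generates the partition {evens, odds}

def cond_exp_given_Y(X, Y, p):
    """E[X|Y](w) = E[X | Y = Y(w)]: average X over the level set {Y = Y(w)}."""
    out = {}
    for w in p:
        level = [v for v in p if Y[v] == Y[w]]
        mass = sum(p[v] for v in level)
        out[w] = sum(p[v] * X[v] for v in level) / mass
    return out

E_X_Y = cond_exp_given_Y(X, Y, p)

# E[X|Y] is sigma(Y)-measurable: constant on each level set of Y, so it can be
# written as f(Y) for some function f -- it is itself a random variable
assert all(E_X_Y[v] == E_X_Y[w] for v in omegas for w in omegas if Y[v] == Y[w])
```

The final assertion is exactly the measurability requirement from the definition, and it is what makes $E[X \mid Y]$ expressible as $f(Y)$, echoing the comment above the answer.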

  • Thanks a lot! The relation $E[X|Y](\omega)=E[X|Y=Y(\omega)]$ especially helped me a lot in understanding what is going on. Commented Mar 13 at 15:19
  • @MintChocolate Just to be clear, that relation only literally holds when $P(Y=Y(\omega)) \neq 0$; otherwise the right-hand side isn't well defined. The precise statement is "$E[X \mid Y](\omega_0)=\lim_{\epsilon \to 0^+} E[X \mid \{ \omega : d(Y(\omega),Y(\omega_0))<\epsilon \}]$ for almost all $\omega_0$", which is a lot less intuitive than the version in my answer.
    – Ian
    Commented Mar 13 at 16:42
