7
$\begingroup$

I'm having some trouble reconciling what I thought I had learned about Radon-Nikodym (RN) derivatives as they relate to probability measures (Wikipedia, lecture notes) with this blog post by John Baez, which mentions them in relation to KL divergences.

Specifically, when John says:

And by the way, in case you're wondering, the $d$ here doesn't actually mean much: we're just so brainwashed into wanting a $dx$ in our integrals that people often use $d$ for a measure even though the simpler notation $\mu$ might be more logical. So, the function:

$\frac{d\mu}{d\nu}$

is really just a ratio of probability measures, but people call it a Radon-Nikodym derivative, because it looks like a derivative (and in some important examples it actually is).

It seems to me that there is some deep intuition I'm missing here. Could anyone please elaborate on this perspective?

Specifically, I'm asking:

  • To what degree is an RN derivative really just a ratio of probability measures?
  • If it isn't technically a ratio of two measures, is it sometimes helpful to think of it as being that ratio? In what instances?

Thanks!

$\endgroup$
  • $\begingroup$ Your third link is just wikipedia again. Also, what exactly are you looking for? $\endgroup$
    – user223391
    Commented May 21, 2015 at 22:26
  • $\begingroup$ I think the $d$ represents a change in the normal sense of a derivative, and this idea of changing measure works for me. Also, the fact that it appears in the area of option pricing derivatives adds a bonus to it ;). $\endgroup$
    – Chinny84
    Commented May 21, 2015 at 22:43
  • $\begingroup$ The change of measure is only allowed if the event can occur under both probability measures. So if something has positive probability under one measure, then it must also have positive probability under the other. $\endgroup$
    – Chinny84
    Commented May 21, 2015 at 22:45
  • $\begingroup$ @avid19 sorry about that I fixed it. I am just looking for the "intuition" behind John's description of the RN derivative. $\endgroup$
    – Diego
    Commented May 21, 2015 at 22:52

2 Answers

7
$\begingroup$

The Radon-Nikodym “derivative” is a concept that is only defined almost everywhere (a.e.). Suppose $(X,S)$ is a measurable space and $\mu,\nu$ are finite measures on $(X,S)$ with $\mu\ll\nu$. Then the theorem is:

Theorem. There exists $f\in L^{1}(X,\nu)$ a non-negative real-valued function, with $\mu(A)=\int_{x\in A} f(x)~\nu(dx)$ for all $A\in S$.

There are all sorts of generalisations (to $\sigma$-finite, signed, and complex-valued measures, etc.). This is the theorem/lemma in its simplest form. It is easy to show that if $f,f'$ both satisfy the theorem, then $f'=f$ $\nu$-a.e. (and thus also $\mu$-a.e.). Thus the function is “unique”, which warrants the definite article.
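To make the statement concrete, here is a minimal sketch on a finite space, where the density really is a pointwise ratio of masses wherever $\nu(\{x\})>0$. The four-point space and the mass values are made-up illustrative assumptions, and `measure`, `density` and `integrates_to_mu` are hypothetical helper names. The second density differs from the first only on a $\nu$-null point, which illustrates the a.e. uniqueness.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical finite sample space and point masses (illustrative values only).
X = ["a", "b", "c", "d"]
nu = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2), "d": Fraction(0)}
mu = {"a": Fraction(1, 2), "b": Fraction(1, 8), "c": Fraction(3, 8), "d": Fraction(0)}


def measure(m, A):
    """Measure of a subset A under the point-mass dictionary m."""
    return sum((m[x] for x in A), Fraction(0))


def density(mu, nu, null_value=Fraction(0)):
    """Pointwise ratio mu({x})/nu({x}); an arbitrary value on nu-null points."""
    return {x: (mu[x] / nu[x] if nu[x] != 0 else null_value) for x in nu}


def integrates_to_mu(f):
    """Check mu(A) == sum_{x in A} f(x) * nu({x}) for every subset A of X."""
    subsets = (A for k in range(len(X) + 1) for A in combinations(X, k))
    return all(
        measure(mu, A) == sum((f[x] * nu[x] for x in A), Fraction(0))
        for A in subsets
    )


f = density(mu, nu)
# f_alt differs from f only at "d", which is a nu-null point.
f_alt = density(mu, nu, null_value=Fraction(99))
print(f, integrates_to_mu(f), integrates_to_mu(f_alt))
```

Both candidate densities reproduce $\mu$ on every subset, even though they disagree at the null point `"d"`.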

Since it follows that $\int g~d\mu=\int gf~d\nu$ for all $g$ in $L^{1}(X,\mu)$, one may write "$d\mu=f d\nu$", so that the notation $\frac{d\mu}{d\nu}$ for $f$ is an intuitive name. Since $f$ has this property, it behaves as an “effective” derivative.
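As a worked instance of the rule "$d\mu=f d\nu$", consider an illustrative choice of measures (an assumption for the sake of example, not something taken from the question): $\nu=N(0,1)$ and $\mu=N(m,1)$ on $\mathbf{R}$ for a fixed real $m$. Both have strictly positive Lebesgue densities, so $\mu\ll\nu$, and the derivative is the ratio of those densities:

$$\frac{d\mu}{d\nu}(x)=\frac{\varphi(x-m)}{\varphi(x)}=\exp\left(mx-\frac{m^{2}}{2}\right),\qquad\text{where }\varphi(x)=\frac{1}{\sqrt{2\pi}}e^{-x^{2}/2}.$$

Hence $\int g~d\mu=\int g(x)\,e^{mx-m^{2}/2}~\nu(dx)$ for $g\in L^{1}(\mathbf{R},\mu)$.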


Here is an outline of the proof/construction:

Lemma. Let $\sigma$ be a signed measure on $X$ (this means countably additive and finite). Then there is a measurable set $A\subseteq X$ satisfying $(\forall{B\in S})~B\subseteq A\Rightarrow\sigma(B)\geq 0$ and $(\forall{B\in S})~B\subseteq X\setminus A\Rightarrow\sigma(B)\leq 0$. Thus $\sigma=\sigma^{+}-\sigma^{-}$, whereby $\sigma^{+}=\sigma\mid_{A}$ and $\sigma^{-}=-\sigma\mid_{X\setminus A}$ are non-negative finite measures with disjoint support. Moreover, any such decomposition of $\sigma$ is of this form and $A$ is unique $\sigma$-a.e.
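On a finite space the Lemma can be checked by brute force, since a positive set is simply the set of points carrying non-negative mass. The following is a minimal sketch under that assumption. The point masses are made up, `signed_measure` and `subsets` are hypothetical helper names, and $\sigma^{-}$ is taken as the negative of the restriction to $X\setminus A$ so that it comes out non-negative.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical signed point masses on a four-point space (illustrative only).
sigma = {
    "a": Fraction(1, 2),
    "b": Fraction(-1, 3),
    "c": Fraction(0),
    "d": Fraction(-1, 6),
}


def signed_measure(B):
    """sigma(B) for a set B of points."""
    return sum((sigma[x] for x in B), Fraction(0))


def subsets(points):
    """All subsets of a finite collection of points."""
    points = list(points)
    return (set(B) for k in range(len(points) + 1) for B in combinations(points, k))


# Positive set: every point carrying non-negative mass.
A = {x for x, mass in sigma.items() if mass >= 0}
complement = set(sigma) - A

# The two defining properties of the decomposition.
positive_on_A = all(signed_measure(B) >= 0 for B in subsets(A))
negative_off_A = all(signed_measure(B) <= 0 for B in subsets(complement))

# Non-negative parts with disjoint supports, so that sigma = sigma_plus - sigma_minus.
sigma_plus = {x: (sigma[x] if x in A else Fraction(0)) for x in sigma}
sigma_minus = {x: (-sigma[x] if x in complement else Fraction(0)) for x in sigma}

print(sorted(A), positive_on_A, negative_off_A)
```

Moving the zero-mass point `"c"` to the complement gives another valid choice of $A$, which is the finite-space version of $A$ being unique only up to null sets.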

Proof of Theorem (sketch). For each $r\in\mathbb{Q}$, let $A_{r}\in S$ be as in the Lemma for the signed measure $\sigma_{r}:=r\nu-\mu$. Consider now two rationals $r<r'$. Then $\sigma_{r'}=\sigma_{r}+(r'-r)\nu$. Let $B=A_{r}\setminus A_{r'}$. Since $B\subseteq A_{r}$ and $B\subseteq X\setminus A_{r'}$, the definition of the decompositions gives

$$0\geq\sigma_{r'}(B)=\sigma_{r}(B)+(r'-r)\nu(B)\geq 0+(r'-r)\nu(B),$$

thus $\nu(B)=0$. Thus $A_{r}\subseteq^{\ast}_{\nu}A_{r'}$, i.e. $A_{r}\subseteq A_{r'}$ up to a $\nu$-null set. Replacing $A_{r}$ with the $\nu$-a.e. equivalent set $\bigcup_{q\in\mathbb{Q},q\leq r}A_{q}$, it can be assumed that $(A_{q})_{q\in(\mathbb{Q},<)}$ is a monotone increasing family of measurable sets.

Consider finally the map $f:x\in X\mapsto \inf\{r\in\mathbb{Q}:x\in A_{r}\}\in[-\infty,\infty]$. This can readily be shown to be well-defined, measurable and $\nu$-a.e. finite. Let $B\in S$. For all reals $r<r'$, it holds that $f^{-1}[r,r')=A_{r'}\setminus A_{r}$ $\nu$-a.e. and thus also $\mu$-a.e.; it therefore holds that

\begin{align} \int_{B\cap f^{-1}[r,r')}f~d\nu-\mu(B\cap f^{-1}[r,r')) &\geq r\nu(B\cap A_{r'}\setminus A_{r})-\mu(B\cap A_{r'}\setminus A_{r})\\ &= (r-r')\nu(B\cap A_{r'}\setminus A_{r}) +\sigma_{r'}(B\cap A_{r'}\setminus A_{r})\\ &\geq -(r'-r)\nu(B\cap A_{r'}\setminus A_{r})+0\\ &= -(r'-r)\nu(B\cap f^{-1}[r,r')) \end{align}

and

\begin{align} \int_{B\cap f^{-1}[r,r')}f~d\nu-\mu(B\cap f^{-1}[r,r')) &\leq r'\nu(B\cap A_{r'}\setminus A_{r})-\mu(B\cap A_{r'}\setminus A_{r})\\ &= (r'-r)\nu(B\cap A_{r'}\setminus A_{r}) +\sigma_{r}(B\cap A_{r'}\setminus A_{r})\\ &\leq (r'-r)\nu(B\cap A_{r'}\setminus A_{r})+0\\ &= (r'-r)\nu(B\cap f^{-1}[r,r')). \end{align}

Thus

$$\left|\int_{B\cap f^{-1}[r,r')}f~d\nu-\mu(B\cap f^{-1}[r,r'))\right| \leq (r'-r)~\nu(B\cap f^{-1}[r,r')).$$

Decomposing $B$ into countably many preimages under $f$ of intervals of length at most $\varepsilon$ and summing these estimates, it follows that

$$\left|\int_{B}f~d\nu-\mu(B)\right|\leq\varepsilon\nu(B)$$

for all $\varepsilon>0$. Thus $\int_{B}f~d\nu=\mu(B)$ for all $B\in S$ (in particular $f\in L^{1}(\nu)$).

Q.e.d.
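The construction in the sketch can be imitated numerically on a finite space. The following is only an illustration under simplifying assumptions: every point has positive $\nu$-mass, the masses are made up, and the infimum over all rationals is replaced by a minimum over a finite grid with spacing `step` (so the result overshoots the true ratio by less than `step`). The names `positive_set` and `approx_density` are hypothetical.

```python
from fractions import Fraction

# Hypothetical point masses on a three-point space (illustrative only).
nu = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2)}
mu = {"a": Fraction(1, 2), "b": Fraction(1, 8), "c": Fraction(3, 8)}


def positive_set(r):
    """A positive set A_r for sigma_r = r*nu - mu on a finite space."""
    return {x for x in nu if r * nu[x] - mu[x] >= 0}


def approx_density(step, r_max=Fraction(4)):
    """f(x) = min{r on the grid : x in A_r}, a grid version of the infimum.

    r_max is assumed to be at least the largest ratio mu({x})/nu({x}),
    so that every point lies in some A_r on the grid.
    """
    n = int(r_max / step)
    grid = [k * step for k in range(n + 1)]
    return {x: min(r for r in grid if x in positive_set(r)) for x in nu}


# On a finite space with nu({x}) > 0, the true density is the ratio of masses.
exact = {x: mu[x] / nu[x] for x in nu}

for step in (Fraction(1, 2), Fraction(1, 10), Fraction(1, 100)):
    f = approx_density(step)
    worst = max(f[x] - exact[x] for x in nu)
    print(step, worst)
```

Here the sets $A_{r}$ are automatically increasing in $r$, so the replacement by unions in the proof is not needed. In general that step is what makes the family monotone.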


Interpreting the derivative (aside from its application).

The construction in the proof allows one to describe the derivative. It holds for $\varepsilon>0$ and reals $a<b$ with $|b-a|<\varepsilon$ and $\nu$-a.e. $x\in X$ that $a\leq\frac{d\mu}{d\nu}(x)<b$ if and only if $x\in A_{b}\setminus A_{a}=:B$, which means the set of such $x$ constitutes a set $B$ with $b\nu(B)-\mu(B)=\sigma_{b}(B)\geq 0$ and $a\nu(B)-\mu(B)=\sigma_{a}(B)\leq 0$.

It immediately follows that either $\nu(B)=0$ (i.e. $\frac{d\mu}{d\nu}$ is $\nu$-a.e. not in $[a,b)$), or else $a\leq\frac{\mu(B)}{\nu(B)}\leq b$. Thus for $|b-a|<\varepsilon$ small

$$\left|\frac{d\mu}{d\nu}(x)-\frac{\mu(B)}{\nu(B)}\right|<\varepsilon$$

for $\nu$-a.e. $x\in B:=\{y\in X:\frac{d\mu}{d\nu}(y)\in[a,b)\}$.

Or in other words: let $f=\frac{d\mu}{d\nu}$. Then for $\nu$-a.e. $x\in X$ it holds that $\frac{d\mu}{d\nu}(x)=\lim_{\varepsilon\to 0^{+}}\frac{\mu(f^{-1}U_{\varepsilon,x})}{\nu(f^{-1}U_{\varepsilon,x})}$, where $U_{\varepsilon,x}\subseteq\mathbf{R}$ is the $\varepsilon$-interval around $f(x)\in\mathbf{R}$.
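This limit can be watched numerically in a simple case. The sketch below assumes, purely for illustration, that $\nu$ is Lebesgue measure on $[0,1]$ and that $\mu$ has density $f(x)=3x^{2}$ there, so that $\mu((a,b))=b^{3}-a^{3}$. Since $f$ is increasing on $[0,1]$, the preimage $f^{-1}U_{\varepsilon,x}$ is an interval that can be written down explicitly. The function names are hypothetical.

```python
import math


def f(x):
    """Assumed density of mu with respect to Lebesgue measure on [0, 1]."""
    return 3.0 * x * x


def mu_interval(a, b):
    """mu((a, b)) = integral of 3 t^2 over (a, b)."""
    return b**3 - a**3


def nu_interval(a, b):
    """nu((a, b)) for Lebesgue measure."""
    return b - a


def localized_ratio(x, eps):
    """mu(f^{-1}U) / nu(f^{-1}U) for U = (f(x) - eps, f(x) + eps).

    f is increasing on [0, 1], so the preimage is the interval (lo, hi),
    clipped to [0, 1].
    """
    lo = math.sqrt(max(f(x) - eps, 0.0) / 3.0)
    hi = min(math.sqrt((f(x) + eps) / 3.0), 1.0)
    return mu_interval(lo, hi) / nu_interval(lo, hi)


x = 0.5
for eps in (0.5, 0.1, 0.01, 0.001):
    print(eps, localized_ratio(x, eps), f(x))
```

The ratio is an average of $f$ over the preimage, where $f$ stays within $\varepsilon$ of $f(x)$, which is why the printed ratios approach $f(0.5)=0.75$ as $\varepsilon$ shrinks.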

$\endgroup$
  • 2
    $\begingroup$ “is really just a ratio of probability measures”: sorry, this is just far too oversimplified; it is at most a highly localised ratio of measures. But it is certainly NOT equal to $\mu(w)/\nu(w)$ for $w\in\Omega$. That expression does not even make sense, as the notation should be $\mu(\{w\})$, and this is typically $0$. At most one may consider $\frac{d\mu}{d\nu}(w)$ as a limit of $\mu(A)/\nu(A)$ for certain measurable sets $A$ $\ldots$ but even this is wrong-headed. Measures are really messy. To prove RN, one applies a decomposition result for signed measures to $\mu-q\nu$ for all $q\in\mathbf{R}$. $\endgroup$
    – Thomas
    Commented May 21, 2015 at 23:04
  • $\begingroup$ Thomas, Is this comment meant to address @Simon's answer? $\endgroup$
    – Diego
    Commented May 21, 2015 at 23:46
  • $\begingroup$ Diego, I am commenting on what “John says” as well as on Simon Rigby's (former) remark. $\endgroup$
    – Thomas
    Commented May 22, 2015 at 14:08
  • $\begingroup$ thank you very much for your thorough answer. Particularly useful for me was the "interpreting" section! $\endgroup$
    – Diego
    Commented May 22, 2015 at 18:17
4
$\begingroup$

The Radon-Nikodym derivative is a thing which re-weights the probabilities, i.e. it is a ratio of two probability densities or masses. It is used when moving from one measure to another, for whatever reason you have to do so. So, say $X$ is a random variable and you want to work out $\mathbb{E}_\lambda[X]$, i.e. the expectation of $X$ in $(\Omega,\mathcal{F},\lambda)$, but you know more about the measure $\mu$. If you have the RD derivative $d\lambda/d\mu$ then you can find that $\mathbb{E}_\lambda[X] = \mathbb{E}_\mu[\frac{d\lambda}{d\mu}X]$. I think this is what your Proposition 1.7 is saying.
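The re-weighting identity can be checked exactly on a small discrete example. This is a sketch with made-up measures on a hypothetical four-point space, where the derivative is just the ratio of point masses. Exact rational arithmetic is used so that both sides can be compared with `==`.

```python
from fractions import Fraction

# Hypothetical four-point space: mu is uniform, lam is the measure of interest.
mu = {w: Fraction(1, 4) for w in (1, 2, 3, 4)}
lam = {w: Fraction(w, 10) for w in (1, 2, 3, 4)}


def X(w):
    """An arbitrary random variable on the sample space."""
    return w * w


# d(lam)/d(mu) as a ratio of point masses (mu has no null points here).
Z = {w: lam[w] / mu[w] for w in mu}

expect_lam = sum(lam[w] * X(w) for w in lam)  # E_lambda[X]
expect_mu_weighted = sum(mu[w] * Z[w] * X(w) for w in mu)  # E_mu[Z * X]
print(expect_lam, expect_mu_weighted)
```

The same idea, with samples drawn from $\mu$ and weighted by the derivative, is what importance sampling does.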

I'm not sure what examples the quoted author is referring to where the Radon-Nikodym derivative really is a derivative. This might be interesting to learn.
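For what it's worth, one standard case (which may or may not be what the quoted author had in mind) is a measure $\mu$ on the real line with an absolutely continuous distribution function $F$, compared against Lebesgue measure $\nu$: there $\frac{d\mu}{d\nu}=F'$ a.e., an ordinary derivative. A minimal numerical sketch, assuming the exponential distribution with rate 1 as the example and using the hypothetical helper `interval_ratio`:

```python
import math


def F(x):
    """CDF of the exponential distribution with rate 1: F(x) = 1 - exp(-x)."""
    return -math.expm1(-x)


def interval_ratio(x, h):
    """mu((x-h, x+h)) / nu((x-h, x+h)), with nu = Lebesgue measure.

    This is also the central difference quotient of F at x.
    """
    return (F(x + h) - F(x - h)) / (2.0 * h)


x = 1.0
for h in (0.5, 0.1, 0.001):
    # The ratio of measures of small intervals approaches F'(x) = exp(-x).
    print(h, interval_ratio(x, h), math.exp(-x))
```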

$\endgroup$
  • $\begingroup$ Is the RD derivative you write about somehow related to the Radon-Nikodym derivative? $\endgroup$
    – GEdgar
    Commented May 21, 2015 at 23:37
  • $\begingroup$ @GEdgar I think they're the same thing $\endgroup$
    – Diego
    Commented May 21, 2015 at 23:45
  • $\begingroup$ @Stanley I think this problem: math.stackexchange.com/questions/2524383/… has to do with the Radon-Nikodym derivative, although I've never heard it called that. Can you please help me with it? Thank you. $\endgroup$
    – user100463
    Commented Nov 17, 2017 at 11:48
