60
$\begingroup$

Suppose $X$ is a real-valued random variable and let $P_X$ denote the distribution of $X$. Then $$ E(|X-c|) = \int_\mathbb{R} |x-c| \,dP_X(x). $$ A median of $X$ is any number $m \in \mathbb{R}$ such that $P(X \leq m) \geq \frac{1}{2}$ and $P(X \geq m) \geq \frac{1}{2}$.

Why do the medians solve $$ \min_{c \in \mathbb{R}} E(|X-c|) \, ? $$
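
Numerically the claim seems to hold. For instance, this quick Monte Carlo sketch (Python/NumPy; the exponential distribution is just an arbitrary test case) compares the minimizer of a sample estimate of $E(|X-c|)$ with the sample median:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=100_000)  # X ~ Exp(1); true median = ln 2

# Monte Carlo estimate of E|X - c| over a grid of candidate values c
grid = np.linspace(0.0, 3.0, 301)
cost = np.array([np.abs(x - c).mean() for c in grid])

print("argmin_c E|X-c| ~", grid[cost.argmin()])  # close to ln 2 ~ 0.693
print("sample median   ~", np.median(x))
```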

$\endgroup$

6 Answers

64
$\begingroup$

For every real valued random variable $X$, $$ \mathrm E(|X-c|)=\int_{-\infty}^c\mathrm P(X\leqslant t)\,\mathrm dt+\int_c^{+\infty}\mathrm P(X\geqslant t)\,\mathrm dt $$ hence the function $u:c\mapsto \mathrm E(|X-c|)$ is differentiable almost everywhere and, where $u'(c)$ exists, $u'(c)=\mathrm P(X\leqslant c)-\mathrm P(X\geqslant c)$. Hence $u'(c)\leqslant0$ if $c$ is smaller than every median, $u'(c)=0$ if $c$ is a median, and $u'(c)\geqslant0$ if $c$ is greater than every median.

The formula for $\mathrm E(|X-c|)$ is the integrated version of the relations $$(x-y)^+=\int_y^{+\infty}[t\leqslant x]\,\mathrm dt$$ and $|x-c|=((-x)-(-c))^++(x-c)^+$, which yield, for every $x$ and $c$, $$ |x-c|=\int_{-\infty}^c[x\leqslant t]\,\mathrm dt+\int_c^{+\infty}[x\geqslant t]\,\mathrm dt $$
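
(If it helps, the displayed identity is easy to verify numerically. The following sketch, in Python/SciPy with a normal distribution as an arbitrary continuous test case, compares the direct computation of $\mathrm E(|X-c|)$ against the sum of the two tail integrals.)

```python
import numpy as np
from scipy import integrate, stats

X = stats.norm(loc=2.0, scale=1.5)  # arbitrary continuous test distribution
c = 0.7

# Direct computation: E|X-c| = int |x-c| pdf(x) dx, split at the kink x = c
lo, _ = integrate.quad(lambda x: (c - x) * X.pdf(x), -np.inf, c)
hi, _ = integrate.quad(lambda x: (x - c) * X.pdf(x), c, np.inf)

# Tail-integral formula: int_{-inf}^c P(X<=t) dt + int_c^inf P(X>=t) dt
left, _ = integrate.quad(X.cdf, -np.inf, c)
right, _ = integrate.quad(X.sf, c, np.inf)  # sf(t) = P(X > t) = P(X >= t) here

print(lo + hi, left + right)  # the two values agree to quadrature accuracy
```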

$\endgroup$
18
  • $\begingroup$ Thanks! (1) By "integrated version", do you mean: first integrate your last formula wrt $P$ over $\mathbb{R}$, then apply Fubini's theorem to exchange the order of the two integrals, and so obtain the first formula? (2) If we temporarily change the notation so that $P$ represents the cdf of $X$, is Sivaram's Edit correct? Specifically, do those Riemann-Stieltjes integrals exist? $\endgroup$
    – Tim
    Commented Nov 25, 2011 at 8:16
  • 7
    $\begingroup$ This is a very nice proof. A clarification for future readers, who, like me, would be perplexed by the notation $[x\leqslant t]$ and $[x\geqslant t]$: If $A$ is an event, $[A]$ denotes the indicator function $\mathbb{1}_A$. In particular, $[x\leqslant t] = \mathbb{1}_{\{x \leq t\}}$, and likewise $[x\geqslant t] = \mathbb{1}_{\{x\geq t\}}$. $\endgroup$
    – Evan Aad
    Commented Oct 18, 2016 at 9:31
  • 2
    $\begingroup$ @EvanAad Adding convexity to the pot will allow you to conclude. $\endgroup$
    – Did
    Commented Oct 18, 2016 at 16:11
  • 10
    $\begingroup$ @Vim The convexity of the function $u$ makes these objections moot, but here is a more direct route: for every $x$ and $c$, $$ |x-c|=\int_{-\infty}^c[x\leqslant t]\,\mathrm dt+\int_c^{+\infty}[x>t]\,\mathrm dt $$ hence, for every median $m$, $$E(|X-c|)=E(|X-m|)+\int_m^cv(t)dt$$ with $$v(t)=P(X\leqslant t)-P(X>t)=2P(X\leqslant t)-1$$ Then $v$ is nondecreasing and $v(m)\geqslant0$ hence, for every $c>m$, $v\geqslant0$ on $(m,c)$, which implies $E(|X-c|)\geqslant E(|X-m|)$. Likewise for $c<m$. $\endgroup$
    – Did
    Commented Nov 24, 2016 at 6:03
  • 1
    $\begingroup$ Why is the first equation true? I've tried to find an explanation on the net, but it seems to be intuitive for everyone but me. $\endgroup$
    – Lillys
    Commented Nov 16, 2020 at 16:34
21
$\begingroup$

Let $f$ be the pdf and let $J(c) = E(|X-c|)$. We want to minimize $J(c)$. Note that $E(|X-c|) = \int_{\mathbb{R}} |x-c| f(x)\, dx = \int_{-\infty}^{c} (c-x) f(x)\, dx + \int_c^{\infty} (x-c) f(x)\, dx.$

To find the minimum, set $\frac{dJ}{dc} = 0$. By the Leibniz integral rule (both boundary terms vanish, since the integrands are zero at $x=c$), we get $$\begin{align} \frac{dJ}{dc} & = (c-x)f(x) \big| _{x=c} + \int_{-\infty}^{c} f(x)\, dx - (x-c)f(x) \big| _{x=c} - \int_c^{\infty} f(x)\, dx\\ & = \int_{-\infty}^{c} f(x)\, dx - \int_c^{\infty} f(x)\, dx = 0. \end{align} $$

Hence, we get that $c$ is such that $$\int_{-\infty}^{c} f(x) dx = \int_c^{\infty} f(x) dx$$ i.e. $$P(X \leq c) = P(X > c).$$

However, we also know that $P(X \leq c) + P(X > c) = 1$. Hence, we get that $$P(X \leq c) = P(X > c) = \frac12.$$
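
(As a numerical cross-check of this first-order condition — a sketch only, in Python/SciPy, with a gamma distribution as an arbitrary choice of a distribution with a density — a finite difference of $J$ matches $\int_{-\infty}^c f - \int_c^\infty f = 2F(c)-1$:)

```python
import numpy as np
from scipy import integrate, stats

X = stats.gamma(a=3.0)  # arbitrary test distribution with a density

def J(c):
    # J(c) = E|X - c|, split at the kink x = c for quadrature accuracy
    lo, _ = integrate.quad(lambda x: (c - x) * X.pdf(x), 0, c)
    hi, _ = integrate.quad(lambda x: (x - c) * X.pdf(x), c, np.inf)
    return lo + hi

c, h = 2.0, 1e-5
fd = (J(c + h) - J(c - h)) / (2 * h)  # finite-difference dJ/dc
print(fd, 2 * X.cdf(c) - 1)           # matches F(c) - (1 - F(c))
print("zero of dJ/dc at the median:", X.ppf(0.5))
```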

EDIT

When $X$ doesn't have a density, all you need to do is to make use of integration by parts, with $P$ now denoting the cdf of $X$. Assuming $E|X| < \infty$, the boundary terms vanish and we get $$\displaystyle \int_{-\infty}^{c} (c-x)\, dP(x) = \displaystyle \int_{-\infty}^{c} P(x)\, dx.$$ Similarly, we also get that $$\displaystyle \int_{c}^{\infty} (x-c)\, dP(x) = \displaystyle \int_{c}^{\infty} \left(1 - P(x)\right) dx,$$ which recovers the tail-integral formula in Did's answer.
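
(To see the resulting cdf formula $E(|X-c|)=\int_{-\infty}^c P(x)\,dx+\int_c^\infty(1-P(x))\,dx$ in action for a distribution without a density, here is a minimal sketch in Python/NumPy; the four-point discrete distribution is an arbitrary choice.)

```python
import numpy as np

# A discrete X with no density: P(X = k) = 1/4 for k = 1, 2, 3, 4
vals = np.array([1.0, 2.0, 3.0, 4.0])
probs = np.full(4, 0.25)
c = 2.6

direct = probs @ np.abs(vals - c)  # E|X - c|, computed exactly

def F(t):
    # cdf of X evaluated at each point of the array t
    return (vals[:, None] <= t).T.astype(float) @ probs

# Midpoint-rule approximation of  int_{-inf}^c F + int_c^inf (1 - F).
# F vanishes below 1 and equals 1 above 4, so the window [0, 5] suffices.
n = 100_000
t_lo = np.linspace(0.0, c, n, endpoint=False) + c / (2 * n)
t_hi = np.linspace(c, 5.0, n, endpoint=False) + (5.0 - c) / (2 * n)
approx = F(t_lo).sum() * c / n + (1 - F(t_hi)).sum() * (5.0 - c) / n

print(direct, approx)  # the two values agree up to the grid resolution
```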

$\endgroup$
5
  • $\begingroup$ Thanks! But does $X$ always have a density? $\endgroup$
    – Tim
    Commented Nov 25, 2011 at 7:09
  • $\begingroup$ @Tim: I don't think it is hard to adapt the same idea for the case when $X$ doesn't have a density. $\endgroup$
    – user17762
    Commented Nov 25, 2011 at 7:15
    $\begingroup$ So you are thinking of $P$ as the cdf of $X$? $\endgroup$
    – Tim
    Commented Nov 25, 2011 at 7:30
    $\begingroup$ Don't you want to say minimize instead of maximize? $\endgroup$
    – Valentin
    Commented Feb 18, 2021 at 13:11
  • $\begingroup$ It doesn't change the structure of the proof but could there be a typo in $\frac{dJ}{dc} = (c-x)f(x) | _{x=c} + \int_{-\infty}^{c} f(x) dx + (x-c)f(x) | _{x=c} - \int_c^{\infty} f(x) dx$? Applying en.wikipedia.org/wiki/Leibniz_integral_rule, it seems that the third term should be $- (x-c)f(x) | _{x=c}$ instead of $+ (x-c)f(x) | _{x=c}$. $\endgroup$
    – FZS
    Commented Nov 28, 2022 at 2:49
18
$\begingroup$

Let $m$ be any median of $X$. Without loss of generality, we can take $m=0$ (consider $X':=X-m$). The aim is to show $E|X-c|\ge E|X|$ for every $c$.

Consider the case $c\ge 0$. It is straightforward to check that $|X-c|-|X|=c$ when $X\le0$, and $|X-c|-|X|\ge -c$ when $X>0$. It follows that $$ (|X-c|-|X|)\,I(X\le0)=c\,I(X\le0)\tag1 $$ and $$(|X-c|-|X|)\,I(X>0)\ge-c\,I(X>0).\tag2 $$ Adding (1) and (2) and taking expectation yields $$ E(|X-c|-|X|)\ge c\left[P(X\le0)-P(X>0)\right].\tag3 $$ The RHS of (3) equals $c\,[2P(X\le0)-1]$, which is non-negative since $c\ge0$ and zero is a median of $X$. The case $c\le0$ is reduced to the previous one by considering $X':=-X$ and $c':=-c$.
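
(For anyone who wants to see (3) numerically: a minimal simulation sketch in Python/NumPy, with a lognormal sample re-centered at its sample median so that $0$ is approximately a median; the distribution is an arbitrary choice.)

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.lognormal(size=200_000)
x -= np.median(x)          # re-center so that 0 is (approximately) a median

for c in [0.5, 1.0, 2.0]:  # the case c >= 0 treated above
    lhs = np.mean(np.abs(x - c) - np.abs(x))  # E(|X-c| - |X|)
    rhs = c * (2 * np.mean(x <= 0) - 1)       # RHS of (3)
    print(f"c={c}:  E(|X-c|-|X|)={lhs:.4f} >= {rhs:.4f}")
```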

$\endgroup$
5
  • $\begingroup$ Sorry, I know this should be easy but would you be able to elaborate on how the case $c\leq 0$ reduces to the case you proved? $\endgroup$
    – EE18
    Commented Oct 18, 2020 at 21:31
  • 2
    $\begingroup$ @1729_SR In the case $c\le0$, define $c':=-c$ and $X':=-X$. Then $0$ is still a median of $X'$, while $c'\ge0$, so by the just-proved case we deduce $E|X'-c'|\ge E|X'|$. Now observe that $|X'-c'|=|X-c|$ and $|X'|=|X|$. Substituting, we conclude $E|X-c|\ge E|X|$. $\endgroup$
    – grand_chat
    Commented Oct 19, 2020 at 2:42
    $\begingroup$ Thanks very much for the clarification. I suspected the proof would go something like that (and I do promise that I wrestled with it before asking!). When I did the problem, I didn't make the slick "take $m=0$" argument that you did, so I actually kept $m$ all the way through. Thus I was missing the key bit that $m(X) = -m(-X)$, where I am interpreting $m$ as a function producing the (a?) median of a given random variable. Thanks again! $\endgroup$
    – EE18
    Commented Oct 19, 2020 at 12:38
  • $\begingroup$ How is this without loss of generality? If we replace $X:=X-m$, we can conclude that $\mathbb{E}|X-m-c|-\mathbb{E}|X-m|\geq c(2\mathbb{P}(X\leq m)-1)\geq 0$, so $\mathbb{E}|X-m-c|\geq \mathbb{E}|X-m|$. Then can we only conclude that $\mathbb{E}|X-m-c|$ is minimized at $c=0$? $\endgroup$ Commented Oct 23, 2023 at 12:04
  • $\begingroup$ @JacobsonRadical The statement $E|X-m-c|\ge E|X-m|$ for all $c$ is equivalent to the statement $E|X-c|\ge E|X-m|$ for all $c$, which is what OP wants to prove. $\endgroup$
    – grand_chat
    Commented Nov 6, 2023 at 0:36
6
$\begingroup$

The following is intended to complement Did's answer.

Claim

Denote by $M$ the set of $X$'s medians. Then

  1. $M = [m_1, m_2]$ for some $m_1, m_2 \in \mathbb{R}$, such that $m_1 \leq m_2$.

  2. For every $m \in M$ and for every $x \in \mathbb{R}$ we have $$ E\left(|X-m|\right) \leq E\left(|X-x|\right). $$ (In particular, $m\mapsto E\left(|X-m|\right)$ is constant on $M$.)

Part 2's proof builds on Did's answer.

Proof

  1. It is known that $M \neq \emptyset$. Define $$ \begin{align} M_1 &:= \left\{t\in\mathbb{R} : F_X(t) \geq \frac{1}{2}\right\}, \\ M_2 &:= \left\{t\in\mathbb{R} : P(X<t) \leq \frac{1}{2}\right\}. \end{align} $$ Then $M = M_1 \cap M_2$. It therefore suffices to show that $M_1 = [m_1, \infty)$ and that $M_2 = (-\infty, m_2]$, for some $m_1, m_2 \in \mathbb{R}$.

    Since $\lim_{t\rightarrow-\infty}F_X(t) = 0$, $M_1$ is bounded from below. Since $F_X$ is nondecreasing and $\lim_{t\rightarrow\infty}F_X(t) = 1$, $M_1$ is a nonempty interval that extends to infinity. Hence $M_1 = (m_1,\infty)$ or $M_1 = [m_1,\infty)$, for some $m_1 \in \mathbb{R}$. It follows from $F_X$'s right-continuity that $m_1 \in M_1$. An analogous argument shows that $M_2 = (-\infty,m_2]$ (just verify that $t\mapsto P(X<t)$ is left-continuous).

  2. Define a function $f:\mathbb{R}\rightarrow\mathbb{R}$ as follows. For every $c \in \mathbb{R}$, set $$ f(c) := E\left(|X-c|\right). $$

    We will begin by showing that $f$ is convex. Let $a, b \in \mathbb{R}$, and let $t \in (0,1)$. Then $$ \begin{align} f\left(ta+(1-t)b\right) &= E\left(\left|X-\left(ta+(1-t)b\right)\right|\right) \\ &= E\left(\left|\left(tX-ta\right)+\left((1-t)X-(1-t)b\right)\right|\right) \\ &\leq E\left(\left|\left(tX-ta\right)\right|+\left|\left((1-t)X-(1-t)b\right)\right|\right) \\ &=E\left(\left|\left(tX-ta\right)\right|\right)+E\left(\left|\left((1-t)X-(1-t)b\right)\right|\right) \\ &= t\ E\left(|X-a|\right) + (1-t)\ E\left(|X-b|\right) \\ &= t\ f(a) + (1-t)\ f(b). \end{align} $$

    Since $f$ is convex, then, by Theorem 7.40 of [1] (p. 157), there exists a set $A \subseteq \mathbb{R}$ such that $\mathbb{R}\setminus A$ is countable, and such that $f$ is finitely differentiable on $A$. Moreover, letting $m \in M$, and letting $x \in (-\infty, m_1)$, Theorem 7.43 of [1] (p. 158) yields that $f'$ is Lebesgue-integrable on $[x,m] \cap A$, and that $$ f(m) - f(x) = \int_{[x,m]\cap A} f'\ d\lambda. $$

    Applying Did's answer, we find that $f'\leq 0$ on $[x,m]\cap A$. Hence $f(m) \leq f(x)$. Similar considerations show that, for every $x \in (m_2,\infty)$, $f(m) \leq f(x)$, and also that $f(m) = f(m_1)$ (implying that $f$ is constant on $M$, since $m$ was chosen arbitrarily in $M$). A brief numerical illustration of these two facts follows the proof.

    (The argument of the last paragraph was suggested to me by copper.hat in their answer to a related question of mine.)

Q.E.D.
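
As promised above, here is a brief numerical illustration of Part 2 (a sketch in Python/NumPy; the two-point distribution is chosen because its median set is a whole interval): for $X$ uniform on $\{0,1\}$ we have $M=[0,1]$, and $f(c)=E(|X-c|)$ is convex, equal to $\frac12$ on all of $M$, and strictly larger outside $M$.

```python
import numpy as np

# X takes the values 0 and 1 with probability 1/2 each, so every
# m in [0, 1] is a median: P(X <= m) >= 1/2 and P(X >= m) >= 1/2.
vals = np.array([0.0, 1.0])
probs = np.array([0.5, 0.5])

def f(c):
    # f(c) = E|X - c|
    return probs @ np.abs(vals - c)

for c in np.linspace(-1.0, 2.0, 13):
    print(f"f({c:5.2f}) = {f(c):.3f}")
# f decreases to 1/2 on [0, 1], is constant there, and increases afterwards,
# illustrating both the convexity of f and its constancy on M = [m_1, m_2].
```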


References

[1] Richard L. Wheeden and Antoni Zygmund. Measure and Integral: An Introduction to Real Analysis. 2nd Ed. 2015. CRC Press. ISBN: 978-1-4987-0290-4.

$\endgroup$
5
  • $\begingroup$ Thanks. Now I understand why if $M$ is an interval then any point where $f$ assumes zero derivative is a global minimiser of $f$. However, what if $M$ is a singleton and $f$ is not differentiable there? (Also, could you give the name of the Lebesgue integrability theorem you invoked?) $\endgroup$
    – Vim
    Commented Nov 24, 2016 at 3:17
  • 1
    $\begingroup$ @Vim: 1. A singleton $\{s\}$ is an interval of the form $[m_1, m_2]$, $m_1\leq m_2$, with $m_1:=m_2:=s$. 2. Here's a link to the theorem. $\endgroup$
    – Evan Aad
    Commented Nov 24, 2016 at 9:54
    $\begingroup$ I was not asking whether a singleton is an interval or not; rather, I was thinking about how to apply the convexity in this case. Anyway, it seems solved to me now: even though $f$ can fail to be differentiable at this point, its left and right derivatives surely exist by convexity, and the left one is $\le 0$ and the right one $\ge 0$ by the definition of the median. $\endgroup$
    – Vim
    Commented Nov 24, 2016 at 10:06
  • $\begingroup$ @Vim: My proof covers the case that $M$ is a singleton. $\endgroup$
    – Evan Aad
    Commented Nov 24, 2016 at 12:33
    $\begingroup$ Indeed. I had actually mainly been reading the link in your answer, which seemed a bit simpler, so I could grasp it with less background. The singleton concern arose from there, not from this answer. $\endgroup$
    – Vim
    Commented Nov 24, 2016 at 12:40
1
$\begingroup$

Let $Y=\left|X-c\right|$ and suppose $X$ has a density $f_X$ with cdf $F_X$.

Then, since $Y \geq 0$, $$E(Y) = \int_0^\infty \left(1-F_Y(y)\right) dy.$$

Note that $F_Y(y) = P(c-y \leq X \leq c+y) = F_X(c+y) - F_X(c-y)$.

Thus $$ \begin{align} E(Y) &= \int_0^\infty \big( 1-F_X(c+y) + F_X(c-y) \big)\, dy \\ \frac{d\, E(Y)} {dc} &= \int_0^\infty \big(-f_X(c+y) + f_X(c-y) \big)\, dy \\ &=\int_0^\infty f_X(c-y)\, dy - \int_0^\infty f_X(c+y)\, dy \\ &= \int_{-\infty}^c f_X(x)\, dx - \int^{\infty}_c f_X(x)\, dx \\ & = F_X(c) - (1 - F_X(c)). \end{align} $$

Equating this to zero, we have

$$F_X(c) = \frac{1}{2}.$$

Since $\frac{d^2 E(Y)}{dc^2} = 2f_X(c) \geq 0$, this critical point is indeed a minimum. Hence the median is the minimiser of $E(|X-c|)$.
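
(A numerical sanity check of the derivative formula, as a sketch in Python/SciPy with a normal distribution picked arbitrarily: a finite difference of $c\mapsto E(Y)$ matches $2F_X(c)-1$.)

```python
import numpy as np
from scipy import integrate, stats

X = stats.norm(loc=1.0, scale=2.0)  # arbitrary continuous test case

def EY(c):
    # E(Y) for Y = |X - c|, via 1 - F_Y(y) = 1 - F_X(c+y) + F_X(c-y)
    val, _ = integrate.quad(lambda y: 1 - X.cdf(c + y) + X.cdf(c - y), 0, np.inf)
    return val

c, h = 0.3, 1e-5
fd = (EY(c + h) - EY(c - h)) / (2 * h)  # finite-difference d E(Y) / dc
print(fd, 2 * X.cdf(c) - 1)             # matches F_X(c) - (1 - F_X(c))
```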

$\endgroup$
0
$\begingroup$

Due to the fact that

$\forall x,c \in \mathbb{R}, \\|x-c| = (x-c)\unicode{x1D7D9}_{\{x>c\}} + (c-x)\unicode{x1D7D9}_{\{x\leq c\}} \\= \int_c^x \unicode{x1D7D9}_{\{x>c\}} \,dt + \int_x^c \unicode{x1D7D9}_{\{x \leq c\}} \,dt \\ = \int_c^{\infty} \unicode{x1D7D9}_{\{t<x\}} \,dt + \int_{-\infty}^{c} \unicode{x1D7D9}_{\{t \geq x\}} \,dt,$

for every continuous real-valued random variable $X$ and every $c \in \mathbb{R}$ we obtain, by linearity of expectation,

$\mathbb{E}[|X-c|] = \mathbb{E}[(X-c)\unicode{x1D7D9}_{\{X>c\}}] + \mathbb{E}[(c-X)\unicode{x1D7D9}_{\{X\leq c\}}] $

$=\int_{-\infty}^{\infty} \int_c^{\infty} \unicode{x1D7D9}_{\{x>t\}} \,dt \,dP_X(x) + \int_{-\infty}^{\infty} \int_{-\infty}^c \unicode{x1D7D9}_{\{x \leq t\}} \,dt \,dP_X(x),$ where $P_X$ denotes the distribution of $X$.

Since the integrand is nonnegative and jointly measurable, Fubini's (Tonelli's) theorem lets us exchange the order of integration:

$= \int_c^{\infty} \int_{-\infty}^{\infty} \unicode{x1D7D9}_{\{t<x\}} \,dP_X(x) \,dt + \int_{-\infty}^c \int_{-\infty}^{\infty} \unicode{x1D7D9}_{\{t \geq x\}} \,dP_X(x) \,dt$

$=\int_c^{\infty} \mathbb{E}[\unicode{x1D7D9}_{\{t<X\}}] \,dt + \int_{-\infty}^c \mathbb{E}[\unicode{x1D7D9}_{\{t \geq X\}}] \,dt\\= \int_c^{\infty} \mathbb{P}(X>t) \,dt + \int_{-\infty}^c \mathbb{P}(X \leq t) \,dt.$

By Leibniz's integral rule, the first-order condition of $\mathbb{E}[|X-c|]$ is

$0=\partial_c \mathbb{E}[|X-c|] = (0-\mathbb{P}(X > c)) + (\mathbb{P}(X \leq c) - 0)$

$\implies 2\mathbb{P}(X \leq c) -1 = 0 \implies \mathbb{P}(X \leq c) = \frac{1}{2} = \mathbb{P}(X > c) = \mathbb{P}(X \geq c)$.

The last equality holds since $X$ is a continuous real-valued random variable: the probability of a singleton is $0$, so $\mathbb{P}(X \geq c) = \mathbb{P}(X > c)$.

By definition, such a $c$ is a median of $X$; hence $\mathbb{E}[|X-c|]$ is minimized at the median.
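
(A closed-form illustration, as a sketch in Python/NumPy with $X\sim\mathrm{Exp}(1)$ chosen for convenience: the tail formula gives $\mathbb{E}[|X-c|] = c - 1 + 2e^{-c}$ for $c\ge 0$, minimized exactly at the median $\ln 2$, and a Monte Carlo estimate agrees.)

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(size=500_000)  # X ~ Exp(1); median = ln 2

# Tail formula for c >= 0:
#   E|X-c| = int_0^c (1 - e^(-t)) dt + int_c^inf e^(-t) dt = c - 1 + 2e^(-c),
# whose derivative 1 - 2e^(-c) vanishes exactly at c = ln 2.
for c in [0.2, np.log(2), 1.5]:
    mc = np.abs(x - c).mean()       # Monte Carlo estimate of E|X - c|
    exact = c - 1 + 2 * np.exp(-c)  # closed form from the tail formula
    print(f"c={c:.3f}  MC={mc:.4f}  formula={exact:.4f}")
```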

$\endgroup$
