Is there a simpler proof than mine for this obvious proposition about correlations?

Question

$\newcommand{\e}{\operatorname E}$"Obviously" if $g$ is a weakly increasing function and $X$ and $g(X)$ are both random variables with finite variance, then the covariance (and hence the correlation) between $X$ and $g(X)$ is non-negative. But there is the question of how to prove this "obvious" proposition. My question is whether something simpler than what I wrote should be used instead of what I wrote.

\begin{align} \operatorname{cov}(X,g(X)) = {} & \e\Big( \big(X-\e(X)\big)\big(g(X) - \e(g(X))\big) \Big) \\[6pt] = {} & \e\Big( \big(X-\e(X)\big) \big( (g(X) - g(\e(X))) + (g(\e(X)) - \e(g(X)))\big) \Big) \\[6pt] = {} & \e\Big(\big(X-\e(X)\big)\big( g(X) - g(\e(X))\big)\Big) \\ & + \e\Big( \big( X-\e(X)\big)\big( \underbrace{g(\e(X)) - \e(g(X))} \big) \Big) \end{align}

Now an essential point is that the expression over the $\underbrace{\text{underbrace}}$ is constant, i.e. not random. Therefore it can be pulled out of the outermost expectation operator just after the "plus" sign. Then we're left with $$ \e\big( X-\e(X)\big) $$ and that is zero. In the first term we now have the expected value of $$ \big( X-\e(X)\big)\big( g(X) - g(\e(X))\big) $$ and that random variable is nonnegative with probability $1.$

This leaves me feeling that I may have missed a simpler proof, just because the proposition stated just after the scare-quoted "Obviously" above seems obvious.

Did I miss a simpler proof?

You had a typo in the first underbrace -- which should be $g(E(X)) - E(g(X))$. By the way, I think your proof is already clear and straightforward enough (as well as the underlying idea), hence further essential reduction seems unlikely. — Zhanxiong, Commented Sep 16, 2023 at 2:59

whuber · Accepted Answer · 2023-09-17 15:14:57Z

Let $X_i$ be independent random variables with the same distribution as $X.$ Because $g$ is weakly increasing if and only if $(g(x_2)-g(x_1))(x_2-x_1)\ge 0$ for all real numbers $x_i,$

$$\operatorname{Cov}(X,g(X)) = \frac{1}{2} E[(X_2-X_1)(g(X_2)-g(X_1))] \ge \frac{1}{2}E[0] = 0,$$

QED. (See "How would you explain covariance to someone who understands only the mean?" for an explanation of this formula for covariance.)
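As a sanity check rather than a proof, the pairwise formula can be verified exactly on a small discrete distribution. Expanding the product and using independence of $X_1$ and $X_2$ gives $E[(X_2-X_1)(g(X_2)-g(X_1))] = 2E[Xg(X)] - 2E[X]E[g(X)],$ which is twice the covariance. In the sketch below, the distribution (`values`, `probs`) and the function `g` are arbitrary choices made for illustration only; exact rational arithmetic avoids rounding questions.

```python
from fractions import Fraction
from itertools import product

# Hypothetical example: X takes these values with these probabilities.
values = [-2, 0, 1, 3]
probs = [Fraction(1, 4), Fraction(1, 8), Fraction(1, 8), Fraction(1, 2)]


def g(x):
    # An arbitrary weakly increasing function (flat for x <= 0).
    return max(x, 0) ** 2


def expect(f):
    """E[f(X)] for the discrete distribution above."""
    return sum(p * f(x) for x, p in zip(values, probs))


def cov_direct():
    """Cov(X, g(X)) computed as E[X g(X)] - E[X] E[g(X)]."""
    return expect(lambda x: x * g(x)) - expect(lambda x: x) * expect(g)


def pairwise_terms():
    """All values of (x2 - x1)(g(x2) - g(x1)) with their joint probabilities."""
    support = list(zip(values, probs))
    return [
        (p1 * p2, (x2 - x1) * (g(x2) - g(x1)))
        for (x1, p1), (x2, p2) in product(support, repeat=2)
    ]


def cov_pairwise():
    """(1/2) E[(X2 - X1)(g(X2) - g(X1))] for independent copies X1, X2."""
    return sum(w * t for w, t in pairwise_terms()) / 2


if __name__ == "__main__":
    print(cov_direct(), cov_pairwise())
```

Every entry returned by `pairwise_terms()` is non-negative because `g` is weakly increasing, which is exactly the step that forces the expectation to be non-negative.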

An alternative is to exploit the basic invariance property of covariance with respect to changes of location. Observe that $g$ is weakly increasing if and only if $x\to g(x)+a$ is weakly increasing for any constant $a.$ Thus, without any loss of generality you may assume $x g(x)\ge 0$ for all $x$ by choosing $a = -g(0).$ Consequently, initially shifting $X$ if necessary to make $E[X]=0,$

$$\operatorname{Cov}(X,g(X)) = E[X g(X)]\ge E[0] = 0,$$

QED.
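The location-shift argument can be illustrated the same way. The sketch below assumes a small discrete distribution and a weakly increasing `g` chosen purely for illustration; `h` is a hypothetical helper name for the shifted function $y \mapsto g(y+\mu) - g(\mu)$ with $\mu = E[X],$ so that $h(0)=0$ and the centred variable $Y = X-\mu$ satisfies $Y h(Y)\ge 0$ pointwise.

```python
from fractions import Fraction

# Hypothetical example distribution and weakly increasing function.
values = [-2, 0, 1, 3]
probs = [Fraction(1, 4), Fraction(1, 8), Fraction(1, 8), Fraction(1, 2)]


def g(x):
    return max(x, 0) ** 2


# Mean of X, used to shift X so that the centred variable has mean zero.
mu = sum(p * x for x, p in zip(values, probs))


def h(y):
    # Shifted function: still weakly increasing, and h(0) == 0.
    return g(y + mu) - g(mu)


# Centred values y = x - mu and the pointwise products y * h(y).
centered = [x - mu for x in values]
pointwise = [y * h(y) for y in centered]

# E[Y h(Y)] after the shifts, versus the covariance computed directly.
cov_shifted = sum(p * t for p, t in zip(probs, pointwise))
cov_direct = (
    sum(p * x * g(x) for x, p in zip(values, probs))
    - mu * sum(p * g(x) for x, p in zip(values, probs))
)

if __name__ == "__main__":
    print(pointwise, cov_shifted, cov_direct)
```

Neither shift changes the covariance, so `cov_shifted` and `cov_direct` agree, and non-negativity is visible term by term in `pointwise`.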

Illustrating both approaches is this visual proof without words.

Ben · Accepted Answer · 2023-09-17 01:46:01Z

Your proof is already quite simple, except that you depart from the algebra to state the middle step in prose, which makes it longer and more amorphous than it needs to be. Here it would be acceptable to just take the constant outside the expectation operator and rely on the reader to see why this is allowed, or you could also add a postscript remark at the end of the equations justifying the step. Another thing that would be useful is to explicitly specify where the "weakly increasing" condition comes into the proof. The only other thing you could do to help simplify the presentation is to use some different types of brackets to make them easier to visually differentiate. Here is a simpler way to present the same exact argument.

For any function $g$ you have:

$$\begin{align} \mathbb{Cov}(X, g(X)) &= \mathbb{E} \big[(X - \mathbb{E}(X))(g(X) - \mathbb{E}(g(X))) \Big] \\[12pt] &= \mathbb{E} \big[(X - \mathbb{E}(X)) (g(X) - g(\mathbb{E}(X)) + g(\mathbb{E}(X)) - \mathbb{E}(g(X))) \Big] \\[12pt] &= \mathbb{E} \Big[ (X - \mathbb{E}(X)) (g(X) - g(\mathbb{E}(X))) \Big] + \mathbb{E} \Big[ (X - \mathbb{E}(X)) (g(\mathbb{E}(X)) - \mathbb{E}(g(X))) \Big] \\[12pt] &= \mathbb{E} \Big[ (X - \mathbb{E}(X)) (g(X) - g(\mathbb{E}(X))) \Big] + \Big[ g(\mathbb{E}(X)) - \mathbb{E}(g(X)) \Big] \mathbb{E} (X - \mathbb{E}(X)) \\[12pt] &= \mathbb{E} \Big[ (X - \mathbb{E}(X)) (g(X) - g(\mathbb{E}(X))) \Big] + \Big[ g(\mathbb{E}(X)) - \mathbb{E}(g(X)) \Big] \times 0 \\[12pt] &= \mathbb{E} \Big[ (X - \mathbb{E}(X)) (g(X) - g(\mathbb{E}(X))) \Big]. \\[6pt] \end{align}$$

(The fourth line in this working follows from pulling a constant term outside the expectation operator.) Now, if $g$ is weakly increasing, then we have $\text{sgn} [(x - y) (g(x) - g(y))] \geqslant 0$ for any values $x,y$. Taking $y = \mathbb{E}(X)$ shows that the random variable inside the last expectation is non-negative, which implies that:

$$\begin{align} \mathbb{Cov}(X, g(X)) = \mathbb{E} \Big[ (X - \mathbb{E}(X)) (g(X) - g(\mathbb{E}(X))) \Big] \geqslant 0. \end{align}$$
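To see where the "weakly increasing" condition enters, it may help to note that the centred identity above holds for any $g$, while the sign conclusion does not. The sketch below uses an assumed discrete distribution and two illustrative functions (`g_up`, a weakly increasing step function, and `g_down`, a decreasing one); all names are hypothetical and chosen only for this example.

```python
from fractions import Fraction

# Hypothetical example distribution for X.
values = [-2, 0, 1, 3]
probs = [Fraction(1, 4), Fraction(1, 8), Fraction(1, 8), Fraction(1, 2)]


def expect(f):
    """E[f(X)] for the discrete distribution above."""
    return sum(p * f(x) for x, p in zip(values, probs))


mu = expect(lambda x: x)


def cov(g):
    """Cov(X, g(X)) from the definition."""
    eg = expect(g)
    return expect(lambda x: (x - mu) * (g(x) - eg))


def centered_form(g):
    """E[(X - E X)(g(X) - g(E X))], the last line of the derivation."""
    return expect(lambda x: (x - mu) * (g(x) - g(mu)))


def g_up(x):
    # Weakly increasing step function.
    return 1 if x > 0 else 0


def g_down(x):
    # Decreasing function: the identity still holds, the sign does not.
    return -(x ** 3)


if __name__ == "__main__":
    for g in (g_up, g_down):
        print(g.__name__, cov(g), centered_form(g))
```

For both functions the two expressions coincide, but only for `g_up` is each term $(x - \mathbb{E}(X))(g(x) - g(\mathbb{E}(X)))$ guaranteed non-negative; for `g_down` the covariance comes out negative on this distribution.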

"you depart from the algebra to state the middle step in prose, which makes it longer and more amorphous than it needs to be" Are you referring to the two sentences beginning with "Now an essential point..."? I wonder if you have any idea how many students will fail to understand if you just do the algebra without that verbal part? — Michael Hardy, Commented Sep 18, 2023 at 19:18
Yes, that is what I'm referring to. Personally, I think a simple postscript remark on the step is enough here. Normally I would write a proof for other professional mathematicians/statisticians, but for students, it is good for them to have to grapple with understanding steps like this and a simple postscript stating the essence of the step tells them what to look at. — Ben, Commented Sep 19, 2023 at 3:38
At best that depends heavily on what sort of student you're talking about. — Michael Hardy, Commented Sep 19, 2023 at 22:44
Are you under an impression that students will know that they do not know something without having it pointed out to them? Most of the time it doesn't work that way. — Michael Hardy, Commented Sep 22, 2023 at 0:10
I encourage students to go through working like this and ensure that they understand each step of the working; the operative test being, could they explain what the step did and why it is valid? Now, the student might look at the step (plus the accompanying postscript) and see why the term pulled out is a constant, and why it is okay to pull a constant out of an expectation, and be confident that they can explain this. Alternatively, the student might not understand why the pulled out term is a constant (or some other problem), in which case this ought to spur them to investigate further. — Ben, Commented Sep 22, 2023 at 1:33
