$\begingroup$

I'd like to write a formula for a correlation coefficient that involves sorting continuous observations, both within a variable and by another variable. For example, I'd like to say that $r$ is computed between $X$ sorted by $-Y$ and $Z$ sorted by itself, both in ascending order. My naïve notation for these terms is $\operatorname{sort}(X \mid -Y)$ and $\operatorname{sort}(Z)$, with a separate note that both sorts are ascending. Is there a standard statistical or mathematical notation I can use instead?

For what it's worth, I've considered using the ranks of variables as indices in the actual correlation formula. That seems awkward, though, and still wouldn't let me refer to each variable, individually and symbolically, in the text. Thanks!

$\endgroup$

2 Answers

$\begingroup$

Maybe the most standard option would be: $$X'_i = \pi_2\big((-Y,X)_{(i)}\big),\quad Z'_i = Z_{(i)},\quad r = \text{corr}(X',Z')$$ where the subscript $(i)$ is the standard notation for the $i$-th sorted element (the $i$-th order statistic) and $\pi_2$ is the standard notation for projection onto the second coordinate.

My reaction to this would be: "I guess they want $(-Y,X)$ sorted lexicographically, but why?"
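
To make the notation concrete, here is a minimal NumPy sketch of what the formula above computes, under the assumption that the pairs $(-Y, X)$ are sorted lexicographically (by $-Y$ first, with ties broken by $X$). The function name `sorted_correlation` is hypothetical and chosen only for illustration.

```python
import numpy as np


def sorted_correlation(x, y, z):
    """Pearson r between X ordered by -Y and Z ordered by itself (ascending)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)

    # np.lexsort treats the LAST key as the primary key, so this orders
    # the pairs (-Y, X) lexicographically: by -Y, then by X on ties.
    order = np.lexsort((x, -y))

    x_prime = x[order]     # X'_i = second coordinate of the i-th sorted pair
    z_prime = np.sort(z)   # Z'_i = Z_(i), the i-th order statistic of Z

    # Off-diagonal entry of the 2x2 correlation matrix is corr(X', Z').
    return np.corrcoef(x_prime, z_prime)[0, 1]


r = sorted_correlation([1, 2, 3, 4], [10, 20, 30, 40], [5, 1, 7, 3])
```

In this toy input $X$ increases with $Y$, so sorting $X$ by $-Y$ reverses it, and the correlation with the ascending $Z$ is negative.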

$\endgroup$
  • $\begingroup$ From your phrasing "most standard" may I infer that you believe there is no standard notation? $\endgroup$
    – virtuolie
    Commented Jan 30, 2023 at 18:19
  • $\begingroup$ Correct, I see no standard notation — but if there is some context which makes sense of and justifies the odd way of relating the variables, maybe in that context there would be a more standard way of talking about it. $\endgroup$
    – Matt F.
    Commented Jan 30, 2023 at 18:23
  • $\begingroup$ Unfortunately, the application is as unorthodox as the notation appears to be. The short answer is that I want to manipulate relative rank order while changing the population correlation between the manipulated variables as little as possible. If $Z = \hat{Y}$ ($Y$ predicted from $X$), then the above procedure should have the result that $E(r_{XY}) = E(-r_{XZ})$ and $E(r_{XY} - \rho_{XY}) = -E(-r_{XZ} - \rho_{XY})$. In other words, one recalculates the same population correlation but with reversed net randomness in the estimate. $\endgroup$
    – virtuolie
    Commented Jan 30, 2023 at 18:38
  • $\begingroup$ That actually makes more sense than I expected; if it helps me think of a better notation, I'll let you know. $\endgroup$
    – Matt F.
    Commented Jan 30, 2023 at 19:14
$\begingroup$

According to Wikipedia: "In statistics, the concept of a concomitant, also called the induced order statistic, arises when one sorts the members of a random sample according to corresponding values of another random sample.

"Let $(X_i, Y_i)$, $i = 1, \ldots, n$, be a random sample from a bivariate distribution. If the sample is ordered by the $X_i$, then the $Y$-variate associated with $X_{r:n}$ will be denoted by $Y_{[r:n]}$ and termed the concomitant of the $r$th order statistic."

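As a hedged reading of how this notation could apply to the question (the auxiliary variable $W$ is a name introduced here for illustration and does not come from the quoted source): set $W_i = -Y_i$ and order the pairs $(W_i, X_i)$ by $W$. Then $X_{[i:n]}$ is the concomitant of $W_{i:n}$, that is, $X$ sorted by $-Y$, and the coefficient in the question could be written as $$r = \text{corr}\big(X_{[i:n]},\, Z_{i:n}\big), \quad i = 1, \ldots, n,$$ where $Z_{i:n}$ is the $i$-th order statistic of $Z$.
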
$\endgroup$
