Statistical notation question: How do I represent sorting variables, individually and by each other, symbolically?

Question

I'd like to write a formula for a correlation coefficient that involves sorting continuous observations, both within a variable and by another variable. For example, I'd like to say that $r$ is computed between $X$ sorted by $-Y$ and $Z$ sorted by itself, all ascending. My naïve notation for these terms is $\operatorname{sort}(X \mid -Y)$ and $\operatorname{sort}(Z)$, with a separate statement that both sorts are in ascending order. But is there a standard statistical or mathematical notation I can use?

For what it's worth, I've considered using the ranks of variables as indices in the actual correlation formula. That seems awkward, though, and still wouldn't let me refer to each variable, individually and symbolically, in the text. Thanks!

Matt F. · Accepted Answer · 2023-01-27 14:50:41Z

1

Maybe the most standard option would be: $$X'_i = \pi_2\big((-Y,X)_{(i)}\big),\quad Z'_i = Z_{(i)},\quad r = \operatorname{corr}(X',Z')$$ where the subscript $(i)$ is the standard notation for the $i$th sorted element and $\pi_2$ is the standard notation for projection onto the second coordinate.

My reaction to this would be: "I guess they want $(-Y,X)$ sorted lexicographically, but why?"
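
To make the notation concrete, here is a minimal sketch in plain Python of what the formula above appears to describe: sort the pairs $(-Y, X)$ lexicographically, keep the second coordinate to get $X'$, sort $Z$ on its own to get $Z'$, and correlate the two. The helper names `pearson` and `sorted_by_neg_y` and the toy data are illustrative assumptions, not part of the original answer.

```python
from math import sqrt


def pearson(a, b):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)


def sorted_by_neg_y(x, y):
    """X'_i = pi_2((-Y, X)_(i)): sort pairs (-y, x) lexicographically, keep x."""
    pairs = sorted((-yi, xi) for xi, yi in zip(x, y))
    return [xi for _, xi in pairs]


# Toy data (hypothetical, for illustration only).
x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 40.0, 20.0, 30.0]
z = [7.0, 5.0, 8.0, 6.0]

x_prime = sorted_by_neg_y(x, y)  # x reordered so that -y is ascending
z_prime = sorted(z)              # Z'_i = Z_(i)
r = pearson(x_prime, z_prime)
```

Tuple comparison in Python is lexicographic, so ties in $-Y$ are broken by $X$, which matches the lexicographic reading; with continuous observations ties should be rare anyway.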


From your phrasing "most standard", may I infer that you believe there is no standard notation?
– virtuolie
Commented Jan 30, 2023 at 18:19

Correct, I see no standard notation, but if there is some context that makes sense of and justifies the odd way of relating the variables, maybe in that context there would be a more standard way of talking about it.
– Matt F.
Commented Jan 30, 2023 at 18:23

Unfortunately, the application is as unorthodox as the notation appears to be. The short answer is that I want to manipulate relative rank order while changing the population correlation between the manipulated variables as little as possible. If $Z = \hat{Y}$ ($Y$ predicted from $X$), then the above procedure should have the result that $E(r_{XY}) = E(-r_{XZ})$ and $E(r_{XY} - \rho_{XY}) = -E(-r_{XZ} - \rho_{XY})$. In other words, one recalculates the same population correlation but with reversed net randomness in the estimate.
– virtuolie
Commented Jan 30, 2023 at 18:38

That actually makes more sense than I expected; if it helps me think of a better notation, I'll let you know.
– Matt F.
Commented Jan 30, 2023 at 19:14


virtuolie · Accepted Answer · 2023-04-03 19:51:10Z

1

According to Wikipedia: "In statistics, the concept of a concomitant, also called the induced order statistic, arises when one sorts the members of a random sample according to corresponding values of another random sample.

"Let $(X_i, Y_i),\ i = 1, \ldots, n$ be a random sample from a bivariate distribution. If the sample is ordered by the $X_i$, then the $Y$-variate associated with $X_{r:n}$ will be denoted by $Y_{[r:n]}$ and termed the concomitant of the $r$th order statistic."

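Applied to the question, one possible reading (an interpretation, not something the quoted passage states) is to set $W = -Y$, so that "$X$ sorted by $-Y$" is the concomitant $X_{[i:n]}$ of the order statistic $W_{i:n}$, while "$Z$ sorted by itself" is the ordinary order statistic $Z_{i:n}$. A minimal Python sketch of the concomitant operation follows; the function name `concomitants` and the toy data are hypothetical.

```python
def concomitants(keys, values):
    """Reorder `values` so the r-th entry accompanies the r-th smallest key.

    In the quoted notation: keys play the role of X, values the role of Y,
    and the result is [Y_[1:n], ..., Y_[n:n]].
    """
    order = sorted(range(len(keys)), key=lambda i: keys[i])
    return [values[i] for i in order]


# Quoted definition: order the sample by X, read off the accompanying Y.
x_obs = [0.3, -1.2, 2.5]
y_obs = [10.0, 20.0, 30.0]
y_conc = concomitants(x_obs, y_obs)

# The question's case: X as the concomitant of the order statistics of W = -Y.
x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 40.0, 20.0, 30.0]
x_by_neg_y = concomitants([-v for v in y], x)
```

Python's `sorted` is stable, so tied keys keep their original sample order here; for continuous variables ties occur with probability zero, so the tie rule should not matter in practice.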
