There are $m$ normally distributed, independent random variables $N_1, \ldots, N_m$ with distinct means $\mu_1, \ldots \mu_m$ and standard deviations $\sigma_1, \ldots, \sigma_m$. Then, we get a permutation of the numbers $\{1, \ldots, m\}$. How can we efficiently compute, numerically, the (log) probability of observing the random variables in same ordering as this permutation?
An example:
- we have four independent random variables $N_1, N_2, N_3, N_4$, all with different means and variances.
- We are given the permutation (3, 1, 2, 4).
- What's $\Pr(N_3 > N_1 > N_2 > N_4)$?
A closed-form solution is not necessary, but computing the solution using an efficient algorithm with good accuracy is. Also, it's probably necessary to compute a log probability due to the fact that when the number of variables becomes large, computing the actual probability will result in a floating-point underflow.
Some starting points, perhaps...
The most direct way to compute this value, using the example above, is evaluating one of the following integrals, which I believe are equivalent:
$$ \int_{-\infty}^\infty \int_{n_4}^\infty \int_{n_2}^\infty \int_{n_1}^\infty p(n_1)p(n_2)p(n_3)p(n_4)\ dn_3 dn_1 dn_2 dn_4 $$
$$ \int_{-\infty}^\infty \int_{-\infty}^{n_3} \int_{-\infty}^{n_1} \int_{-\infty}^{n_2} p(n_1)p(n_2)p(n_3)p(n_4)\ dn_4 dn_2 dn_1 dn_3 $$
Where $p(n_i)$ is the density function of the variable $N_i$. However, when I tried to implement this numerically, it is inefficient, prone to inaccuracy, and runs into underflow errors when the number of variables gets large. If you think you can compute this integral in an acceptable way, please do post your answer!
From one of the answers below, we observe that it's possible to compute $\Pr(N_3 > N_1 > N_2 > N_4)$ directly by evaluating a multivariate normal CDF of dimension $(m-1)$, or 3 in this case. However, this is still nontrivial (though there may be libraries for it), and will underflow for many variables.
Perhaps we can divide the probability up as follows:
$$\Pr(N_3 > N_1 > N_2 > N_4) = $$ $$\Pr(N_3 > N_1 \mid N_1 > N_2, N_2 > N_4 )\Pr(N_1 > N_2 \mid N_2 > N_4 )\Pr(N_2 > N_4)$$
Being able to compute the probabilities of each part directly would make it very easy to compute the log probability simply by adding. We can compute the conditional probabilities separately using the MVN CDF method, which could help if the product might underflow.
Another observation: the $m!$ possible probabilities corresponding to the different permutations must sum to 1. Perhaps there is a way to compute the probabilities iteratively or using dynamic programming: i.e.: $(N_2 > N_3)$, an ordering over a pair, has some fixed probability, which is further divided into three values by the three possible places to insert $N_1$ into the ordering, further divided into the four values by the possible places to insert $N_3$. This is semantically equivalent to the conditional probabilities above but it might be easier to think of it this way.
Any math wizards have suggestions on how to solve this problem? I would greatly appreciate any ideas!