$\begingroup$

Suppose I have two pairs of distributions: distributions A and B in Pair 1, and distributions C and D in Pair 2. There are non-parametric tests to determine whether there is evidence that the distributions in each pair are different. However, suppose I found significant evidence that distributions A and B are different from each other, and that distributions C and D are different from each other. Are there tests I can use to determine which pair is "more" different?

Or is this even a valid question?

$\endgroup$
  •
    $\begingroup$ Tests do not apply to distributions: they work on data. Are you trying to imply you have independent random samples of each distribution? And even if so, what would you mean by "more" different? There are a huge number of ways to measure differences between distributions, so we need you to explain your sense of "difference." $\endgroup$
    – whuber
    Commented Jul 9, 2022 at 20:49
  • $\begingroup$ Thanks. In asking this question I was actually trying to solve this question: stats.stackexchange.com/questions/581459/tree-association-index. What I had in mind was that, if I can determine which tree has an association index distribution that is more different from its corresponding null distribution, then I can say that it has a stronger degree of clustering in its tip labels. $\endgroup$ Commented Jul 10, 2022 at 2:45

1 Answer

$\begingroup$

I think you might be looking for some sort of divergence measure. KL divergence is a popular one, though it is not symmetric; a symmetric measure might be more desirable in your case (the Jensen–Shannon divergence is one symmetric alternative). Also, you mentioned non-parametric tests: have you considered using the KS statistic? A larger statistic indicates that the two empirical distributions are further apart.
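For instance, the two-sample KS statistic is just the largest vertical gap between the two empirical CDFs, so the two pairs can be compared directly on that distance. A minimal pure-Python sketch (the samples, sizes, and names here are illustrative, not from the question):

```python
import random
from bisect import bisect_right

def ks_distance(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum absolute
    difference between the empirical CDFs of samples x and y.
    The supremum is attained at one of the observed data points."""
    xs, ys = sorted(x), sorted(y)
    return max(
        abs(bisect_right(xs, t) / len(xs) - bisect_right(ys, t) / len(ys))
        for t in xs + ys
    )

rng = random.Random(0)
a = [rng.gauss(0, 1) for _ in range(500)]
b = [rng.gauss(2, 1) for _ in range(500)]    # pair 1: well separated
c = [rng.gauss(0, 1) for _ in range(500)]
d = [rng.gauss(0.3, 1) for _ in range(500)]  # pair 2: only slightly shifted

ks_ab = ks_distance(a, b)
ks_cd = ks_distance(c, d)
print(ks_ab, ks_cd)  # pair (a, b) should show the larger distance
```

With real data you would more likely use `scipy.stats.ks_2samp`, which returns the same statistic along with a p-value; the point of comparing the statistics (not the p-values) is that the KS distance is itself a metric between distributions.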

I am not sure whether there is a way to "test" if this difference between the two differences is significant. It seems to me that, whatever you do, you will have only one instance of this difference, but it seems reasonable to draw some conclusion from it (as long as your distributions come from samples of adequate size).
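One informal option (my sketch, not a standard named test) is to bootstrap that single difference: resample each of the four samples with replacement, recompute the two KS distances, and look at the spread of their difference. A percentile interval that excludes 0 suggests one pair really is "more different" than the other:

```python
import random
from bisect import bisect_right

def ks_distance(x, y):
    """Max absolute difference between the empirical CDFs of x and y."""
    xs, ys = sorted(x), sorted(y)
    return max(
        abs(bisect_right(xs, t) / len(xs) - bisect_right(ys, t) / len(ys))
        for t in xs + ys
    )

def bootstrap_ks_diff(a, b, c, d, n_boot=200, seed=1):
    """Approximate 95% percentile interval for D(a,b) - D(c,d),
    resampling each sample with replacement."""
    rng = random.Random(seed)
    diffs = sorted(
        ks_distance(rng.choices(a, k=len(a)), rng.choices(b, k=len(b)))
        - ks_distance(rng.choices(c, k=len(c)), rng.choices(d, k=len(d)))
        for _ in range(n_boot)
    )
    return diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot)]

rng = random.Random(0)
a = [rng.gauss(0, 1) for _ in range(300)]
b = [rng.gauss(2, 1) for _ in range(300)]  # pair 1: clearly separated
c = [rng.gauss(0, 1) for _ in range(300)]
d = [rng.gauss(0, 1) for _ in range(300)]  # pair 2: same distribution

lo, hi = bootstrap_ks_diff(a, b, c, d)
print(lo, hi)  # here the whole interval should sit above 0
```

Note the caveat: bootstrapped KS distances are biased upward when two samples come from the same distribution (the resampled distance is never exactly 0), so treat this as a rough uncertainty band rather than a formal test.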

$\endgroup$
  • $\begingroup$ Thanks for the answer. Yes, I have considered using the KS statistic, and intuitively I think it does make sense that a larger statistic would indicate that the two distributions are more different — but then this is really comparing two p-values, which I believe isn't valid. But I think some divergence measure, as you suggested, might be a good idea. $\endgroup$ Commented Jul 9, 2022 at 20:17
  • $\begingroup$ The KS statistic is just that: a statistic. It is one of the many metrics (distances) that can be defined between empirical distributions (that is, of data) or, indeed, between any distributions. $\endgroup$
    – whuber
    Commented Jul 9, 2022 at 20:51
  •
    $\begingroup$ @david-young Comparing KS distances is not the same as comparing p-values in general; p-values change with sample size (they tend to get smaller as n grows unless H0 is exactly true), but the expected KS distance doesn't: it converges to the equivalent population distance. $\endgroup$
    – Glen_b
    Commented Jul 10, 2022 at 22:38
