$\begingroup$

Consider a case in which you want to know whether two variables (X, Y) are conditionally independent given a set (C) of other variables. A recent paper (Shah, R. D., and J. Peters. 2020. The hardness of conditional independence testing and the generalised covariance measure. The Annals of Statistics 48:1514-1538) has shown that a test statistic (the generalised covariance measure) is asymptotically distributed as a standard normal variate under the null hypothesis of conditional independence, given suitable conditions on the regressions. To calculate this statistic, one requires two sets of residuals: (i) the response residuals of a regression of X on the set C of conditioning variables and (ii) the response residuals of a regression of Y on the set C of conditioning variables. This provides a way of answering the above question given a large sample size.
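
As a minimal sketch of the calculation described above: the statistic is the normalised mean of the elementwise products of the two residual vectors. The helper names (`ols_residuals`, `gcm_statistic`, `gcm_pvalue`), the use of ordinary least squares for both regressions, and the simulated data are assumptions made for illustration only; any regression method that estimates the conditional means could take the place of OLS.

```python
import math

import numpy as np


def ols_residuals(target, C):
    """Residuals of a least-squares regression of `target` on C (with intercept).

    OLS is an assumption of this sketch, not a requirement of the method.
    """
    design = np.column_stack([np.ones(len(target)), C])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef


def gcm_statistic(res_x, res_y):
    """sqrt(n) * mean(R) / sd(R), where R_i = res_x[i] * res_y[i]."""
    R = np.asarray(res_x) * np.asarray(res_y)
    n = R.size
    return math.sqrt(n) * R.mean() / math.sqrt(np.mean(R**2) - R.mean() ** 2)


def gcm_pvalue(stat):
    """Two-sided p-value from the standard normal reference distribution."""
    return math.erfc(abs(stat) / math.sqrt(2.0))


# Simulated data (hypothetical): X and Y both depend on C but not on each other.
rng = np.random.default_rng(0)
n = 500
C = rng.normal(size=(n, 2))
X = C @ np.array([1.0, -0.5]) + rng.normal(size=n)
Y = C @ np.array([0.3, 0.8]) + rng.normal(size=n)

T = gcm_statistic(ols_residuals(X, C), ols_residuals(Y, C))
print(f"T = {T:.3f}, two-sided p = {gcm_pvalue(T):.3f}")
```

The normal reference distribution used in `gcm_pvalue` is the large-sample approximation, which is why it may be unreliable for small n.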

For small sample sizes, I proposed a permutation test: permute one vector of residuals many times and, each time, calculate the generalised covariance statistic from the permuted vector and the other, non-permuted vector. This would generate an empirical sampling distribution for the generalised covariance statistic when the two vectors of residuals are independent. However, I received a comment that "the issue is that only the conditional mean [of X given C and of Y given C] is being subtracted from the residuals, so under the null, the residuals should have correlation 0, but would not necessarily be independent, and I think this is what the permutation test would rely on."
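
The proposed procedure can be sketched as follows. This only illustrates the mechanics of the permutation scheme and takes no position on whether it is valid, which is the point in dispute. The function names, the number of permutations, the add-one correction to the p-value, and the simulated residuals are all assumptions made for illustration.

```python
import math

import numpy as np


def gcm_statistic(res_x, res_y):
    """sqrt(n) * mean(R) / sd(R), where R_i = res_x[i] * res_y[i]."""
    R = np.asarray(res_x) * np.asarray(res_y)
    return math.sqrt(R.size) * R.mean() / math.sqrt(np.mean(R**2) - R.mean() ** 2)


def permutation_pvalue(res_x, res_y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the generalised covariance statistic.

    Only `res_x` is shuffled; `res_y` is left in its original order.
    """
    rng = np.random.default_rng(seed)
    observed = gcm_statistic(res_x, res_y)
    perm_stats = np.array(
        [gcm_statistic(rng.permutation(res_x), res_y) for _ in range(n_perm)]
    )
    # Add-one correction so the estimated p-value is never exactly zero.
    n_extreme = np.sum(np.abs(perm_stats) >= abs(observed))
    return observed, (1 + n_extreme) / (n_perm + 1)


# Hypothetical small-sample residual vectors, generated independently.
rng = np.random.default_rng(1)
res_x = rng.normal(size=20)
res_y = rng.normal(size=20)

obs, p = permutation_pvalue(res_x, res_y)
print(f"observed T = {obs:.3f}, permutation p = {p:.3f}")
```

Shuffling `res_x` breaks any pairing between the two vectors, so the reference distribution produced here is the one that holds when the residuals are fully independent rather than merely uncorrelated.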

Question: Is this true? In particular, doesn't permuting the first set of residuals ensure that the two sets are independent?

$\endgroup$
