$\begingroup$

The R function chisq.test has a simulate.p.value argument that uses permutation for resampling. I know how to implement this kind of simulation. However, I was wondering whether it is also possible to do the following:

Let's say we have a sample $X_1, X_2, \dots, X_n$ that is distributed across disjoint categories $A$, $B$ and $C$ (i.e., each $X_i$ is either $A$, $B$, or $C$). Now I resample (with replacement) from my sample and compute the $\chi^2$-statistic. Both steps are repeated many times, and I obtain the simulated $p$-value as the proportion of simulated $\chi^2$-statistics exceeding the original $\chi^2$-statistic.
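
In symbols, the procedure above could be written as follows. This is only a sketch: it assumes the null hypothesis is equal proportions across the three categories, and the notation ($O_k$ for the observed count in category $k$, $M$ for the number of resamples, $\chi^{2*}_m$ for the statistic computed on the $m$-th resample) is introduced here for illustration.

$$\chi^2_{\text{obs}} = \sum_{k \in \{A, B, C\}} \frac{(O_k - n/3)^2}{n/3}, \qquad \hat{p} = \frac{1}{M} \sum_{m=1}^{M} \mathbf{1}\left\{\chi^{2*}_m > \chi^2_{\text{obs}}\right\}$$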

The resulting $p$-values are very different from those obtained with simulate.p.value. Certainly, we are comparing different approaches, so it should be no surprise if the simulated $p$-values differ. However, the magnitude of the difference makes me wonder whether there is a flaw in my reasoning.

$\endgroup$
  • $\begingroup$ What was the original H0 your chi-square test was about? Equal proportions in each category, maybe? And say you have N cases in your sample, do you mean you resample with replacement 100 cases each time? $\endgroup$
    – BenP
    Commented Jul 5 at 9:50
  • $\begingroup$ To expand on BenP's comment, the bootstrap you did targets the sampling distribution (you would expect 50% of test statistics to be either larger or smaller than the observed one there), whereas a P-value is calculated from the null distribution. See e.g. here for some more discussion. Resampling under the null (by default that all categories occur with equal probability) will likely reproduce something much closer to that P-value. $\endgroup$
    – PBulls
    Commented Jul 5 at 9:56
  • $\begingroup$ Thanks for your replies. Equal proportions is indeed H0. I see, so the issue is that I compute p-values instead of just displaying the sampling distribution. Thank you! $\endgroup$ Commented Jul 5 at 10:19
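
The distinction drawn in the comments can be illustrated with a small simulation. The following is a minimal Python sketch using only the standard library, not R's implementation: the function names and the example counts (60/25/15) are made up for illustration, and H0 is taken to be equal proportions, as confirmed in the last comment. The first resampler draws from the observed data (the scheme in the question); the second draws fresh samples from H0, which is similar in spirit to simulating a goodness-of-fit test under the null.

```python
import random
from collections import Counter

CATEGORIES = ["A", "B", "C"]  # the three disjoint categories from the question


def chi2_equal(sample):
    """Pearson chi-square statistic against H0: all categories equally likely."""
    expected = len(sample) / len(CATEGORIES)
    counts = Counter(sample)  # missing categories count as 0
    return sum((counts[c] - expected) ** 2 / expected for c in CATEGORIES)


def bootstrap_exceedance(sample, n_rep=2000, seed=0):
    """Resample the observed data with replacement (the question's scheme).

    This approximates the sampling distribution of the statistic,
    not its distribution under H0, so the result is not a p-value.
    """
    rng = random.Random(seed)
    observed = chi2_equal(sample)
    hits = sum(
        chi2_equal(rng.choices(sample, k=len(sample))) >= observed
        for _ in range(n_rep)
    )
    return hits / n_rep


def null_p_value(sample, n_rep=2000, seed=0):
    """Draw samples of the same size from H0 (equal probabilities)."""
    rng = random.Random(seed)
    observed = chi2_equal(sample)
    hits = sum(
        chi2_equal(rng.choices(CATEGORIES, k=len(sample))) >= observed
        for _ in range(n_rep)
    )
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_rep + 1)


# Hypothetical data: 60 A, 25 B, 15 C (clearly not equal proportions)
sample = ["A"] * 60 + ["B"] * 25 + ["C"] * 15
print("observed chi-square: ", chi2_equal(sample))
print("bootstrap exceedance:", bootstrap_exceedance(sample))
print("null p-value:        ", null_p_value(sample))
```

With data this far from equal proportions, the null-based estimate should be tiny, while the bootstrap proportion should land somewhere near one half regardless of how strong the evidence against H0 is. That gap is the large discrepancy described in the question.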
