Is this a reasonable way to check the quality of simulated data in MCMC inference?


Asked 22 days ago

Modified 8 days ago

Viewed 18 times

I have a hierarchical Bayesian model that looks like this:

$$\alpha_i \sim \mathcal{N}\left(\mu_\alpha, \sigma_\alpha\right) \tag{1}$$

$$\beta_i \sim \mathcal{N}\left(\mu_\beta, \sigma_\beta\right) \tag{2}$$

$$\gamma_i \sim \mathcal{N}\left(\mu_\gamma, \sigma_\gamma\right) \tag{3}$$

So my model has lower-level parameters $\alpha_i, \beta_i, \gamma_i$ for each participant $i$, and upper-level parameters $\mu_\alpha, \sigma_\alpha, \mu_\beta, \sigma_\beta, \mu_\gamma, \sigma_\gamma$.
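To make the structure of $(1)-(3)$ concrete, here is a minimal NumPy sketch of a forward draw from the hierarchy. The upper-level values and the number of participants are made up for illustration, and the second argument of each normal is treated as a standard deviation, which is an assumption about the notation.

```python
import numpy as np

rng = np.random.default_rng(0)

n_participants = 20  # hypothetical number of participants

# Hypothetical upper-level values (mean, standard deviation) per parameter.
upper = {
    "alpha": (0.5, 0.2),
    "beta": (1.0, 0.3),
    "gamma": (-0.3, 0.1),
}

# Equations (1)-(3): one draw per participant i for each lower-level parameter.
lower = {
    name: rng.normal(loc=mu, scale=sigma, size=n_participants)
    for name, (mu, sigma) in upper.items()
}

# Stack into one array: rows are participants, columns are alpha_i, beta_i, gamma_i.
theta = np.column_stack([lower["alpha"], lower["beta"], lower["gamma"]])
print(theta.shape)
```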

Assume I have specified an observation model so that I can fit these parameters with the NUTS sampler (let's say 4 chains, 1000 posterior samples each).

After fitting the model, I get 4000 posterior samples for each parameter. Let's denote them by $\alpha_i^{\text{est}}, \beta_i^{\text{est}}, \gamma_i^{\text{est}}, \mu_\alpha^{\text{est}}, \ldots \sigma_\gamma^{\text{est}}$.

Now, I want to simulate lower-level parameters using the upper-level parameters, and check if the simulated parameters are "close" to the estimated ones.

The way I have done this is described below:

1. Use $\mu_\alpha^{\text{est}}, \ldots \sigma_\gamma^{\text{est}}$ in $(1)-(3)$ to sample lower-level parameters; let's denote them by $\alpha_i^{\text{sim}}, \beta_i^{\text{sim}}, \gamma_i^{\text{sim}}$.
2. Use PCA to visualize these parameters on a Cartesian plane. Fit the PCA map on $\alpha_i^{\text{est}}, \beta_i^{\text{est}}, \gamma_i^{\text{est}}$, that is, on an array of size (4000, 3), and then use it to reduce that array to size (4000, 2).
3. Use the same fitted PCA map to reduce the simulated parameters $\alpha_i^{\text{sim}}, \beta_i^{\text{sim}}, \gamma_i^{\text{sim}}$ from (4000, 3) to (4000, 2).
4. Plot both reduced arrays and see if there's an overlap.
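A minimal sketch of the procedure above, for a single participant $i$. It assumes the posterior draws are available as NumPy arrays; the arrays here are random placeholders rather than output from a real fit, and PCA is done by hand with an SVD on centred, unscaled data (a library PCA such as scikit-learn's should give the same projection up to sign).

```python
import numpy as np

rng = np.random.default_rng(1)
n_draws = 4000  # 4 chains x 1000 posterior samples

# Placeholder posterior draws for one participant: columns alpha_i, beta_i, gamma_i.
# In practice these come from the fitted model, not from a random generator.
est = rng.normal(loc=[0.4, 1.1, -0.2], scale=[0.05, 0.08, 0.03], size=(n_draws, 3))

# Placeholder posterior draws of the upper-level means and standard deviations.
mu_est = rng.normal(loc=[0.5, 1.0, -0.3], scale=0.02, size=(n_draws, 3))
sigma_est = np.abs(rng.normal(loc=[0.2, 0.3, 0.1], scale=0.01, size=(n_draws, 3)))

# Sample one simulated lower-level draw per posterior draw of (mu, sigma).
sim = rng.normal(loc=mu_est, scale=sigma_est)

# Fit PCA on the estimated draws: centre, then take the top two right singular vectors.
center = est.mean(axis=0)
_, _, vt = np.linalg.svd(est - center, full_matrices=False)
components = vt[:2]  # first two principal axes, shape (2, 3)

def project(x):
    """Apply the PCA map fitted on `est` to an array of shape (n, 3)."""
    return (x - center) @ components.T

est_2d = project(est)  # shape (4000, 2)
sim_2d = project(sim)  # same map reused for the simulated draws, shape (4000, 2)
```

The final plotting step would then be a scatter plot of `est_2d` and `sim_2d` on the same axes, for example with matplotlib.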

I want to know whether this is an OK thing to do, or whether I am violating any assumptions.

Note that I can also plot the prior parameters by simply sampling from $(1)-(3)$ and then using the fitted PCA map to reduce them. Ideally, on the PCA plot, the estimated and simulated parameters would overlap substantially, and both would lie within the region covered by the prior parameters (this is what I get on my data).
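The overlap described here is judged by eye. As a rough numeric companion, one could count how many points of one cloud fall inside a percentile box of another. This is a crude substitute for the visual check, not an established diagnostic; the function name `fraction_inside` and the 2-D arrays below are hypothetical placeholders for the PCA-reduced draws.

```python
import numpy as np

def fraction_inside(points, reference, lo=2.5, hi=97.5):
    """Fraction of `points` inside the per-axis [lo, hi] percentile box of `reference`.

    A crude stand-in for visual overlap; it ignores correlation between axes.
    """
    lower = np.percentile(reference, lo, axis=0)
    upper = np.percentile(reference, hi, axis=0)
    inside = np.all((points >= lower) & (points <= upper), axis=1)
    return inside.mean()

rng = np.random.default_rng(2)

# Placeholder 2-D projections; in practice these are the PCA-reduced arrays.
prior_2d = rng.normal(scale=1.0, size=(4000, 2))  # wide prior cloud
est_2d = rng.normal(scale=0.2, size=(4000, 2))    # tighter estimated cloud
sim_2d = rng.normal(scale=0.3, size=(4000, 2))    # simulated cloud

print(fraction_inside(sim_2d, est_2d))    # share of simulated points in the estimated box
print(fraction_inside(est_2d, prior_2d))  # share of estimated points in the prior box
```

A value near 1 for the second number would match the picture of the estimated cloud sitting inside the prior region. The first number depends heavily on the chosen percentiles.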

I want to know if this is a reasonable way to assess the quality of simulated data. Any suggestions or guidance will be highly appreciated.

edited Jul 12 at 14:19

kjetil b halvorsen♦


asked Jun 28 at 14:26

chesslad



Nothing prevents you from running this experiment, since there is no theoretical measure associated with it. Note that "MCMC inference" is a misnomer.
– Xi'an
Commented Jun 28 at 15:26
