Could someone explain precisely what is meant by a prior predictive check in Bayesian inference? Some documents use the observed data (“in which we compared the observed data to the predictions of the model”), while others do not (“summarizing our knowledge prior to observing the data”).
As far as I know (and I am far from an expert in Bayesian methods), the first case sounds more like what is called a posterior predictive check, which seems quite clearly documented and which I believe I understand well. For the prior predictive check, on the other hand, the procedure is still not clear to me.
So as not to speak in a vacuum, I give a (slightly artificial) example below.
Let's imagine that I am trying to model the number of vehicles passing a given point on a road in one minute, for which it seems reasonable to use a Poisson distribution with parameter $\lambda$. I have learned that a Gamma distribution is most often used as the prior for $\lambda$. Since in similar situations the average number of vehicles passing in 1 minute is around $20$, it seems to me that I should use a Gamma($\alpha$, $\beta$) distribution with $\alpha/\beta \approx 20$. Except that for the pair ($\alpha$, $\beta$) I could take (2, 0.1), or (20, 1), or many others…
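To make the difference between such pairs concrete, here is a small sketch that computes the prior mean and standard deviation of $\lambda$. It assumes the shape/rate parameterization (mean $\alpha/\beta$, standard deviation $\sqrt{\alpha}/\beta$), which is what $\alpha/\beta \approx 20$ implies; the helper name `gamma_mean_sd` is only for illustration.

```python
import math

def gamma_mean_sd(alpha, beta):
    """Mean and standard deviation of Gamma(shape=alpha, rate=beta)."""
    # Assumption: beta is a rate, so mean = alpha / beta and var = alpha / beta**2.
    return alpha / beta, math.sqrt(alpha) / beta

# Both pairs share the same prior mean but differ in spread.
for alpha, beta in [(2, 0.1), (20, 1)]:
    mean, sd = gamma_mean_sd(alpha, beta)
    print(f"Gamma({alpha}, {beta}): mean = {mean:.1f}, sd = {sd:.2f}")
```

So the mean alone does not pin down the prior: the smaller $\alpha$ is (at fixed $\alpha/\beta$), the wider the prior on $\lambda$.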
My current understanding of the prior predictive check therefore leads me to proceed as follows:
1. I decide the number of observations of my Poisson distribution that I will make, let's say $n = 100$.
2. I give myself two values $\alpha$ and $\beta$ such that $\alpha/\beta = 20$.
3. I sample a value $\lambda_i$ from Gamma($\alpha$, $\beta$).
4. With this $\lambda_i$, I sample $n$ values from Poisson($\lambda_i$) and note the maximum $M_i$ of the $n$ sampled values.
5. I repeat steps 3 and 4 $N$ times (for example $1000$ times).
6. I plot a histogram of the $N$ maximum values $M_i$.
7. I create several histograms for different pairs ($\alpha$, $\beta$).
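The steps above can be sketched roughly as follows. This is only a minimal illustration, not my full program: it assumes NumPy, treats $\beta$ as a rate (NumPy's `gamma` takes a scale, so I pass `1 / beta`), and the function name `prior_predictive_maxima` and the fixed seed are my own choices for the example. The plotting is left out.

```python
import numpy as np

def prior_predictive_maxima(alpha, beta, n=100, N=1000, seed=0):
    """Simulate N prior predictive datasets of size n; return each dataset's maximum."""
    rng = np.random.default_rng(seed)
    # Step 3: draw N rates lambda_i from Gamma(shape=alpha, rate=beta).
    # NumPy parameterizes the Gamma by scale, hence scale = 1 / beta.
    lam = rng.gamma(shape=alpha, scale=1.0 / beta, size=N)
    # Step 4: for each lambda_i, draw n Poisson counts (one row per dataset).
    counts = rng.poisson(lam=lam[:, None], size=(N, n))
    # Keep the maximum M_i of each simulated dataset.
    return counts.max(axis=1)

# Step 7: repeat for several (alpha, beta) pairs, all with alpha / beta = 20.
pairs = [(0.2, 0.01), (2, 0.1), (20, 1), (200, 10)]
maxima = {pair: prior_predictive_maxima(*pair) for pair in pairs}

for pair, m in maxima.items():
    print(pair, "median of maxima:", np.median(m),
          "99th percentile:", np.percentile(m, 99))
```

Each array in `maxima` can then be passed to a histogram function (step 6) to compare the pairs side by side.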
The result I obtained is given by the plot below (I am not giving the entire program so as not to overload the post):
A discussion of the experiment to be carried out concludes that it is impossible for several hundred vehicles to pass the given point in 1 minute (so the pair $(0.2, 0.01)$ must be eliminated because its maximum values are excessive); on the other hand, it sometimes happens that a hundred vehicles, or a little more, pass (so the pairs $(20, 1)$ and $(200, 10)$ must be eliminated because their maximum values are too low).
Finally, I opt for the Gamma$(2, 0.1)$ prior, which appears the most adequate.
Does this reasoning really constitute a prior predictive check? Is this the usual way of reasoning?
And if not, if I was completely wrong in detailing this example, could you give me a concrete example of how to do a prior predictive check?
Any information to resolve my doubts will be welcome!