
This question may sound ridiculous, so let me clarify the motivation:

Fisher information and Bayesian posterior uncertainty seemed very cool to me because they can effectively tell you "how representative your sample is". But they do this via uncertainty in the distribution/model parameters.

I was wondering whether it is possible to do something similar for free-form (i.e. non-parametric) distributions (e.g. uncertainty in a KDE), or whether the distributional assumptions made in parametric statistics are somehow required to answer the question "how representative is my sample?"

P.S. Unlike this poster, I'm not trying to encode known uncertainties but to detect uncertainties in non-parametric distributions (e.g. a KDE).
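
For concreteness, here is a rough sketch of the kind of thing I have in mind (purely illustrative; the synthetic data, evaluation grid, and use of scipy's `gaussian_kde` are my own placeholder choices): refit the KDE on bootstrap resamples and treat the spread of the resulting density curves as "uncertainty in the KDE".

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
sample = rng.normal(size=200)        # placeholder data
grid = np.linspace(-4, 4, 200)       # points at which to evaluate the density

# Refit the KDE on bootstrap resamples and collect the density curves.
boot_curves = np.array([
    gaussian_kde(rng.choice(sample, size=sample.size, replace=True))(grid)
    for _ in range(500)
])

point_est = gaussian_kde(sample)(grid)
lower, upper = np.percentile(boot_curves, [2.5, 97.5], axis=0)

# Wide bands relative to the point estimate flag regions where the sample
# says little about the density; narrow bands suggest it is well pinned down.
print(np.max((upper - lower) / np.maximum(point_est, 1e-12)))
```

What I can't tell is whether this kind of resampling spread is a principled answer to "how representative is my sample", or whether something like Fisher information fundamentally needs a parametric model.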

  • I am not sure where to start with this one. "Non-parametric" is too generic. KDE, if it is like this, can be viewed as parametric: you have the parameters of your kernel estimate. In a Bayesian treatment these will have their own probability distributions, and thus will contribute to the uncertainty in the posterior predictive. Is this what you are talking about?
    – Cryo
    Commented Dec 29, 2023 at 9:49
  • @Cryo I think what you mentioned could be relevant to my question and I'm curious to hear about it! More generally, though, I'm asking this: is it possible to quantify/estimate how "representative" your sample is without (or with minimal, e.g. KDE-style) assumptions about the population distribution?
    – profPlum
    Commented Jan 2 at 20:57
  • I kind of skipped your 'representative' term in the last reading, and now that I look at it, I am not sure what you mean. If you had a parametric distribution, in any parametrization you choose (which could even include mixtures), then you could ask how well your data fit that distribution. What is representativeness without a distribution? i.e. what would you like to find out? Perhaps it is something like "I don't need to get any more data, this is enough".
    – Cryo
    Commented Jan 2 at 22:32
  • You could try to bin your support, i.e. approximate the distribution with a finite histogram, represent this with a Dirichlet-multinomial distribution, and then check how much uncertainty there is in your histogram given your data (a rough sketch of this appears after the comments).
    – Cryo
    Commented Jan 2 at 22:32
  • If you describe the measurements you have (it could be in a simplified fashion), it would be easier to proceed
    – Cryo
    Commented Jan 2 at 22:33
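
A minimal sketch of the binned Dirichlet idea from the comment above (illustrative only: the bin edges, the uniform prior, and the choice of numpy's `Generator.dirichlet` for posterior draws are assumptions, not anything prescribed in the thread):

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.normal(size=200)          # placeholder data

edges = np.linspace(-4, 4, 21)         # 20 equal-width bins over the support
counts, _ = np.histogram(sample, bins=edges)

alpha_prior = np.ones(counts.size)     # uniform Dirichlet prior (assumed)
alpha_post = alpha_prior + counts      # conjugate Dirichlet-multinomial update

# Posterior draws of the bin-probability vector.
draws = rng.dirichlet(alpha_post, size=5000)
post_mean = draws.mean(axis=0)
post_sd = draws.std(axis=0)

# Bins where the posterior sd is large relative to the posterior mean are
# where the data pin the distribution down least.
print(np.round(post_sd / np.maximum(post_mean, 1e-12), 2))
```

The appeal of this framing is that "how representative is my sample" becomes "how concentrated is the posterior over the binned distribution", at the cost of having to choose a binning.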
