$\begingroup$

Let $X$ be a uniform distribution over the inputs, used for sampling, and let $f(x)$ be an expensive function. If we draw samples from $X$ and pass them to $f$, we get outputs $y_1, y_2, \ldots, y_n$ from some mystery distribution $Y$ with finite support. I wish to estimate the minimum of $Y$ with, say, 95% confidence, or alternatively something like the 5th percentile with 95% confidence.

The two methods I know of are Chebyshev's inequality and Wilks' method. For Chebyshev's inequality, I don't know which variant to use. The one shown on Wikipedia appears to be the closest to what I want, but it is two-tailed. Wilks' method seems more promising, but it appears to be very conservative.
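To make the two options concrete, here is a minimal sketch of what each one gives for a one-sided lower bound. It assumes the $y_i$ are independent draws from $Y$; the function names are hypothetical and chosen only for illustration. The first part is the first-order, one-sided Wilks criterion: the sample minimum lies at or below the $p$-quantile with probability $1 - (1 - p)^n$, so it bounds the 5th percentile, not the true minimum of $Y$. The second part is the one-sided form of Chebyshev's inequality (Cantelli's inequality), which strictly requires the true mean and variance, so using sample estimates makes it only approximate.

```python
import math


def wilks_min_samples(p: float = 0.05, confidence: float = 0.95) -> int:
    """Smallest n for which the sample minimum is a lower bound on the
    p-quantile with at least the requested confidence (i.i.d. samples assumed).

    Solves 1 - (1 - p)**n >= confidence for n.
    """
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


def coverage_of_minimum(n: int, p: float = 0.05) -> float:
    """Probability that at least one of n i.i.d. samples falls at or below
    the p-quantile, i.e. that the sample minimum bounds it from below."""
    return 1.0 - (1.0 - p) ** n


def cantelli_k(alpha: float = 0.05) -> float:
    """Number of standard deviations k such that Cantelli's inequality gives
    P(Y <= mean - k * sd) <= 1 / (1 + k**2) = alpha.

    Assumes the TRUE mean and sd are known; with sample estimates the
    resulting bound is only approximate.
    """
    return math.sqrt(1.0 / alpha - 1.0)


if __name__ == "__main__":
    n = wilks_min_samples(0.05, 0.95)
    print("Wilks sample size:", n)
    print("Coverage of the minimum at that n:", coverage_of_minimum(n))
    print("Cantelli k for a 5% lower tail:", cantelli_k(0.05))
```

The Wilks bound needs no distributional assumption, which is also why it looks conservative: it uses only the rank of the smallest observation. The Cantelli bound uses only two moments and is typically looser still.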

What method would you recommend?

$\endgroup$
  • $\begingroup$ Is $f(x)$ a monotonic function? If so, the answer is available immediately without any approximation. In any case, you have the whole distribution of the $y_i$, not just the sample mean and variance, so approximations based only on the mean and variance are not optimal. $\endgroup$ Commented Jun 5, 2021 at 4:37
  • $\begingroup$ $f(x)$ is not monotonic. And yes, I guess I have the entire sample distribution of $Y$. Does that make things better? $\endgroup$
    – kpjoshi
    Commented Jun 6, 2021 at 13:40
  • $\begingroup$ Are you really after quantiles (as in your question) or are you after tail probabilities (as in the Wikipedia method you link to)? If you do want quantiles, then which ones exactly? If you want a tail probability, then which one exactly? In either case, having the entire sample distribution helps considerably. $\endgroup$ Commented Jun 7, 2021 at 0:20
  • $\begingroup$ Mmm. That's a very different (and much harder) problem to that which your question seems to be asking. I suggest editing your question to make your aims clearer. $\endgroup$ Commented Jun 7, 2021 at 0:57
  • $\begingroup$ Because you explicitly posit a nonparametric setting (all you assume is "some mystery distribution"), no "statistical formula" is available. You can make progress only by making restrictive assumptions about that distribution, such as supposing it is a shifted Lognormal. (This is just an example, not a suggestion!) $\endgroup$
    – whuber
    Commented Jun 8, 2021 at 13:28
