2
$\begingroup$

Say I uniformly at random distribute $x = n^3$ (independent identically distributed) points in a ball of radius $r=1$ in $\mathbb{R}^3$.

What can be said about the expected maximum, minimum, and mean distance between the nearest neighbours of any given point, asymptotically as $n\to\infty$?

Trying to simulate these, I get:

up to 64³ points averaged across 500 runs

Don't mind the "Theoretical Mean" value too much, that's really just for the slope. It's $\frac{1}{x} = \frac{1}{n^3}$ where $n$ is the number of points.

I'm getting:

Min Distance Fit:
Slope: -2.0084
Intercept: 1.0666
R²: 0.9336
Equation: y = 1.0666 * x^(-2.0084)

Mean Distance Fit:
Slope: -1.0253
Intercept: 0.9858
R²: 0.9987
Equation: y = 0.9858 * x^(-1.0253)

Max Distance Fit:
Slope: -0.8986
Intercept: 1.7097
R²: 0.9918
Equation: y = 1.7097 * x^(-0.8986)

So those are pretty good fits, but is there anything known about this in exact terms? I know for just two points the answer for the mean ought to be $\frac{35}{36}\approx0.9722$ according to this answer https://math.stackexchange.com/a/167983/49989
The slope of the minimum is conspicuously close to -2.

I also saw there is this https://en.wikipedia.org/wiki/Complete_spatial_randomness which, if I did it right, implies that for completely uniformly distributed points of a given density I end up with

$$ P\left(r,\rho,N\right)=\frac{3^{1-N}\left(4\pi\rho\right)^N r^{3 N - 1}e^{-\frac{4}{3}\pi\rho r^3}}{\left(N - 1\right)!} $$

for the probability of the distance $r$ to the $N^\text{th}$ nearest neighbour given a density $\rho$.

I guess I could just substitute $\rho = \frac{3 n}{4 \pi R^3}$ for $n$ points in a volume of radius $R$ which gives me

$$ \frac{3 e^{-n\frac{r^3}{R^3}}\left(n\frac{r^3}{R^3}\right)^N}{r\left(N-1\right)!} $$

for which the expectation (over the domain $0\leq r\leq\infty$) should end up being

$$ \frac{R\ \Gamma\left(N+\frac{1}{3}\right)}{n^\frac{1}{3} \Gamma\left(N\right)} $$

and that does seem to be reasonably close to the result of the mean of the nearest neighbour, though it doesn't really agree: It predicts that, for a sphere of radius $R=1$ and $x=n^3$ points I get

$$ \frac{\Gamma\left(\frac{4}{3}\right)}{x}\approx\frac{\text{0.89298}}{x} $$

rather than

$$ \approx\frac{\text{0.9858}}{x^\text{1.0253}} $$

but I'm not sure whether that's within the error of the simulation. It sure doesn't look like it on the plot: Variance quickly becomes quite small for larger sets of points (at least for the mean fit) and the line is clearly slightly steeper than this analytical value.

$\endgroup$
3
  • $\begingroup$ Could you please clarify what "distance between nearest neighbors" means? Would it be, for instance, distance to the nearest neighbor? Or distance between the two nearest neighbors? Or perhaps even average distance among the $k$ nearest neighbors for a fixed $k$? And would "given point" mean one of the $x$ points or would it be a specified coordinate in the interior of the ball? $\endgroup$
    – whuber
    Commented Jul 8 at 18:37
  • $\begingroup$ Distance to the nearest neighbour, although in the minimum case that would presumably be equivalent to distance between the two nearest neighbours. $\endgroup$
    – kram1032
    Commented Jul 8 at 18:39
  • $\begingroup$ Will you give formulas for the minimum, maximum and mean that you seek in terms of the $x_i$? Do the formulas also involve a reference point $P$? Do the formulas quantify over all possibilities for $P$ in the unit ball? $\endgroup$
    – Matt F.
    Commented Jul 12 at 6:00

1 Answer 1

2
$\begingroup$

It helps to look at this abstractly.

Suppose we're in a metric space at a fixed point $x_0$ and for any possible radius $r\ge 0$ the chance that a single random point lies within distance $r$ of $x_0$ is $\Pr(B(x_0, r)).$ Consequently, by the axioms of probability, the chance that the nearest neighbor of $x_0$ among $N$ iid random points is further than $r$ away is

$$\left(1 - \Pr(B(x_0,r))\right)^N.$$

This is the survival function of the distance to the nearest neighbor.

In your particular setup, the rest is so straightforward that we can leave it to you to complete the work, because

  • $\Pr(B(x_0, r)) = r^3$ for all $r\le 1 - |x_0|$ (the distance to the edge of the unit ball).

  • $(1 - r^3)^N = \exp(-Nr^3) + O(Nr^6).$

The latter makes it clear that in $\mathbb R^3$ (with its usual Euclidean metric) you want to work in terms of $\rho_N = r/\sqrt[3]{N}$ and that the nearest-neighbor distance will have asymptotically an Exponential distribution in terms of $\rho_N^3,$ converging at rate $O(1/N).$

$\endgroup$

Not the answer you're looking for? Browse other questions tagged or ask your own question.