It doesn't seem easy to me to even prove that the asymptotic behavior is $E(N) \sim \alpha \sqrt{N}$ for some constant $\alpha$. But we can certainly simulate the problem (and a couple of its variants) to try to find out. Suppose we look at three scenarios:

  • (A) Steps of fixed length $1/2$, each with a randomly chosen direction;
  • (B) Steps of fixed length $1/2$, each with a direction no more than $120^\circ$ from the previous step;
  • (C) (OP's problem) Each step goes to the nearest unvisited point, where the points are generated by a Poisson process with average density $1$; a simulation sketch for all three scenarios follows this list.
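
As a rough illustration, here is a minimal Python sketch of how these scenarios might be simulated. The original code is not shown, so the function names, the default box size for the Poisson sample, and the brute-force nearest-neighbour search in walk_C are my own choices, not part of the problem statement.

```python
import numpy as np

rng = np.random.default_rng()

def walk_A(N, step=0.5):
    """Scenario (A): fixed-length steps in independent uniform directions."""
    angles = rng.uniform(0.0, 2.0 * np.pi, N)
    steps = step * np.column_stack((np.cos(angles), np.sin(angles)))
    return np.linalg.norm(steps.sum(axis=0))    # distance from the start after N steps

def walk_B(N, step=0.5, max_turn=2.0 * np.pi / 3.0):
    """Scenario (B): each direction within 120 degrees of the previous one."""
    turns = rng.uniform(-max_turn, max_turn, N)
    turns[0] = rng.uniform(0.0, 2.0 * np.pi)    # the first direction is unconstrained
    angles = np.cumsum(turns)
    steps = step * np.column_stack((np.cos(angles), np.sin(angles)))
    return np.linalg.norm(steps.sum(axis=0))

def walk_C(N, box=200.0):
    """Scenario (C): greedy walk to the nearest unvisited point of a
    density-1 Poisson sample.  The box must be chosen large enough that
    the walk never gets near the boundary."""
    n_pts = rng.poisson(box * box)              # expected box**2 points at density 1
    pts = rng.uniform(-box / 2, box / 2, (n_pts, 2))
    pos = np.zeros(2)
    visited = np.zeros(n_pts, dtype=bool)
    for _ in range(N):
        d2 = np.sum((pts - pos) ** 2, axis=1)   # squared distances to all points
        d2[visited] = np.inf                    # exclude already-visited points
        j = np.argmin(d2)
        visited[j] = True
        pos = pts[j]
    return np.linalg.norm(pos)

def estimate(walk, N, samples):
    """Monte Carlo estimate of E(N)/sqrt(N)."""
    return np.mean([walk(N) for _ in range(samples)]) / np.sqrt(N)
```

The estimate helper simply averages the final distance over many independent walks and divides by $\sqrt{N}$; for walk_C a k-d tree would speed up the nearest-neighbour search considerably, but the brute-force scan is enough at this scale.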

Now, we can simulate each scenario a large number of times for a few different values of $N$, and estimate $E(N)/\sqrt{N}$ for each. Here's what I find (taking $10^5$ or $10^6$ samples for each $N$ for the first two methods, and $10^3$ samples for the third and most computationally intensive method):

$$
\begin{array}{|c|c|c|c|}
\hline
N & \text{Method A} & \text{Method B} & \text{Method C} \\
\hline
50 & 0.4440 \pm 0.0002 & 0.6845 \pm 0.0003 & 1.38 \pm 0.02 \\
100 & 0.4436 \pm 0.0007 & 0.688 \pm 0.001 & 1.45 \pm 0.02 \\
200 & 0.4432 \pm 0.0007 & 0.688 \pm 0.001 & 1.56 \pm 0.02 \\
400 & 0.4432 \pm 0.0007 & 0.687 \pm 0.001 & 1.66 \pm 0.02 \\
1000 & 0.4428 \pm 0.0007 & 0.687 \pm 0.001 & 1.78 \pm 0.03 \\
\hline
\end{array}
$$

The first two scenarios converge more or less immediately to a consistent final value for $E(N)/\sqrt{N}$, giving us confidence that we have the correct form for the large-$N$ behavior.

The third scenario, however, appears to have a coefficient that is still increasing with $N$. This suggests that the large-$N$ behavior for the OP's problem may in fact be something more like $E(N) \sim \alpha \sqrt{N \log N}$. But I don't have even a heuristic justification for any particular functional form; many (including $\alpha\sqrt{N}$, if the coefficient converges slowly enough) are consistent with the data.


Update:

Expanding on the discussion in the comments: the root-mean-square distance from the starting point after $N$ steps can be expressed in terms of the autocorrelations $\xi_k^{(n)} \equiv E[r_n \cdot r_{n+k}]$, where $r_n$ is the $n$-th step taken. Let $R_N = \sum_{i=1}^{N} r_i$ be the total displacement after $N$ steps; then
$$
\begin{aligned}
E[R_N\cdot R_N] &= \sum_{i,j=1}^{N}E[r_i \cdot r_j] = \sum_{i=1}^{N}\xi_0^{(i)} + 2\sum_{k=1}^{N-1}\sum_{i=1}^{N-k}\xi_k^{(i)} \\
&= N\xi_0 + 2\sum_{k=1}^{N-1}(N-k)\xi_k \\
&= N\xi_0 + 2N\sum_{k=1}^{N-1}\xi_k - 2\sum_{k=1}^{N-1}k\xi_k,
\end{aligned}
$$
where we have assumed that the $\xi_k$ are time-invariant, i.e., that $\xi_k^{(i)}$ does not depend on $i$. If $\xi_k$ goes to zero fast enough as $k\rightarrow\infty$ (for instance, if $\sum_{k=1}^{\infty}\xi_k$ and $\sum_{k=1}^{\infty}k\xi_k$ both converge), then $E[R_N \cdot R_N]$ is $\Theta(N)$ and the RMS distance from the start is $\Theta(\sqrt{N})$. On the other hand, if, say, $\xi_k \sim \beta / k$, then the middle term dominates, since $2N\sum_{k=1}^{N-1}\xi_k \sim 2\beta N \log N$, and the RMS distance from the start instead grows like $\sqrt{2\beta N \log N}$.
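
For reference, here is a sketch (again my own, reusing the hypothetical greedy-walk setup from the earlier code block) of how the $\xi_k$ can be estimated empirically for scenario (C): record the individual step vectors of each walk and average $r_n \cdot r_{n+k}$ over positions $n$ within a walk, which relies on the time-invariance assumption above, and then over independent walks.

```python
import numpy as np

rng = np.random.default_rng()

def walk_C_steps(N, box=200.0):
    """Step vectors r_1, ..., r_N of one greedy nearest-unvisited walk
    through a density-1 Poisson sample (same hypothetical setup as above)."""
    n_pts = rng.poisson(box * box)
    pts = rng.uniform(-box / 2, box / 2, (n_pts, 2))
    pos = np.zeros(2)
    visited = np.zeros(n_pts, dtype=bool)
    steps = np.empty((N, 2))
    for n in range(N):
        d2 = np.sum((pts - pos) ** 2, axis=1)
        d2[visited] = np.inf
        j = np.argmin(d2)
        visited[j] = True
        steps[n] = pts[j] - pos
        pos = pts[j]
    return steps

def xi_estimates(lags, N=1000, walks=1000):
    """Estimate xi_k = E[r_n . r_{n+k}] for each lag k, averaging the dot
    products over positions n within a walk and over independent walks."""
    totals = np.zeros(len(lags))
    counts = np.zeros(len(lags))
    for _ in range(walks):
        r = walk_C_steps(N)
        for i, k in enumerate(lags):
            totals[i] += np.sum(r[:N - k] * r[k:])   # sum of r_n . r_{n+k}
            counts[i] += N - k
    return totals / counts

# e.g. xi_estimates([1, 2, 4, 8, 16, 32, 64, 128])
```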

My estimates of $\xi_k$ for scenario (C), based on $10^3$ simulated walks of $1000$ steps each, are as follows:

$$
\begin{array}{|c|c|}
\hline
k & \xi_k \\
\hline
1 & 0.1989 \pm 0.0005 \\
2 & 0.1414 \pm 0.0005 \\
4 & 0.0849 \pm 0.0005 \\
8 & 0.0425 \pm 0.0005 \\
16 & 0.0159 \pm 0.0005 \\
32 & 0.0069 \pm 0.0006 \\
64 & 0.0048 \pm 0.0006 \\
128 & 0.0018 \pm 0.0006 \\
\hline
\end{array}
$$

These are certainly consistent with $\xi_k \sim \beta / k$ for some $\beta \in (0.2, 0.3)$. In that case we would expect the distance from the start to grow as $\alpha \sqrt{N \log N}$ with $\alpha = \sqrt{2\beta} \in (0.63, 0.77)$. And, indeed, this reproduces the observed behavior very well: assuming $E(N)=\alpha\sqrt{N\log N}$, the direct estimates of $\alpha$ from the first part of this answer lie in $(0.675, 0.698)$.
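
For completeness, the two ranges quoted above can be checked with a few lines of arithmetic; the sketch below is my own check, assumes natural logarithms (which is consistent with the quoted numbers), and reads the Method C column off the first table.

```python
import numpy as np

# alpha = sqrt(2*beta) for the fitted range of beta
print(np.sqrt(2 * np.array([0.2, 0.3])))       # ~ [0.632, 0.775]

# Direct estimates: if E(N) = alpha*sqrt(N*log N), then
# alpha = (E(N)/sqrt(N)) / sqrt(log N); use the Method C column above.
N = np.array([50, 100, 200, 400, 1000])
c = np.array([1.38, 1.45, 1.56, 1.66, 1.78])   # E(N)/sqrt(N), scenario (C)
print(c / np.sqrt(np.log(N)))                  # values between ~0.675 and ~0.698
```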
