1
$\begingroup$

Let $D_N$ be the expected average of the displacement of a random walk on $\mathbb Z$ from the origin, where $N$ is the number of steps, each of which is either $-1$ or $1$.

We take the definition of the expected average of a variable $A$ to be $\sum_{i=1}^n a_ip_i$, where $A$ can take the values $a_1,a_2,\ldots,a_n$ with probabilities $p_1,p_2,\ldots,p_n$.

In one step, ${D_1}^2 = \frac12\cdot1^2+\frac12\cdot(-1)^2=1$.

If we know $D_{N-1}$, then ${D_N}^2 = \frac12(D_{N-1}+1)^2 + \frac12(D_{N-1}-1)^2 = {D_{N-1}}^2 + 1$.

By induction, this means that ${D_N}^2 = N$, i.e. $D_N = \sqrt N$.

However, if we look at the distribution of end positions after $N$ steps, we can represent the expected average as $D_N = \frac{\binom{N}{0}d_0+\binom{N}{1}d_1 + \cdots + \binom{N}{N}d_N}{2^N}$, where $d_0,d_1,\ldots,d_{N}$ are the actual displacements from the origin.

i.e.:

$$D_1 = \frac{1}{2}\left[\binom{1}{0}\cdot1+\binom{1}{1}\cdot1\right] = 1.0 \\ D_2 = \frac{1}{4}\left[\binom{2}{0}\cdot2+\binom{2}{1}\cdot0+\binom{2}{2}\cdot2\right] = 1.0 \\ D_3 = \frac{1}{8}\left[\binom{3}{0}\cdot3+\binom{3}{1}\cdot1+\binom{3}{2}\cdot1+\binom{3}{3}\cdot3\right] = 1.5 \\ \vdots$$

For $N\ge2$, these values differ from $\sqrt N$.
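For reference, these averages can be computed exactly for any $N$; here is a small Python sketch (the helper name `expected_distance` is mine) that evaluates the binomial sum above with step length 1:

```python
from math import comb, sqrt

def expected_distance(n):
    # E(D_n): after n steps the position is n - 2k with probability C(n, k) / 2^n,
    # so the expected distance is the weighted average of |n - 2k|
    return sum(comb(n, k) * abs(n - 2 * k) for k in range(n + 1)) / 2 ** n

for n in (1, 2, 3, 10, 100):
    print(n, expected_distance(n), sqrt(n))
```

The two columns drift apart: the ratio $D_N/\sqrt N$ tends to $\sqrt{2/\pi}\approx0.798$ rather than $1$, consistent with the answers below.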

I know that $\sqrt N$ is really only an approximation of $D_N$ and not an exact value, so where does the proof that $D_N$ is equal to $\sqrt N$ go wrong?

$\endgroup$
5
  • $\begingroup$ Note that your first recurrence relation $D_{N}^2-D_{N-1}^2=1$ isn't consistent with the numbers you give subsequently. So I suspect that the first recurrence relation is wrong. $\endgroup$ Commented Aug 20, 2014 at 22:21
  • $\begingroup$ @Semiclassical Could you point out what you call the first recurrence relation? I know that the numbers I give do not follow that rule, so I am asking what part of the proof is wrong $\endgroup$
    – Couchy
    Commented Aug 20, 2014 at 22:27
  • $\begingroup$ I'd concur with Did's answer below: you're conflating two different quantities, the expected displacement v. the expected mean-squared displacement. $\endgroup$ Commented Aug 20, 2014 at 22:29
  • $\begingroup$ @Semiclassical So is it not true that $E(D_{N-1})\pm1 = D_N$ where $E(D_N)$ is the expected average of $D_N$, and $D_N$ is the actual value? $\endgroup$
    – Couchy
    Commented Aug 20, 2014 at 23:10
  • $\begingroup$ See? What can be the meaning of an identity such as $E(D_{N-1})\pm1=D_N$, mixing the random $D_N$ with the deterministic $E(D_{N-1})$ (and, rather more subtly, forgetting the effect of the origin, see my post)? $\endgroup$
    – Did
    Commented Aug 20, 2014 at 23:12

2 Answers

10
$\begingroup$

These are the recursions for the mean squared distances and, of course, square roots of mean squared distances are not mean distances since, in general, $E(D^2)\ne E(D)^2$. The recursions for the mean distances $E(D)$ are more complicated.

Edit: Let us try to clearly separate the different notions involved, considering the (random) position $X_N$ after $N$ steps and the (random) distance $D_N=|X_N|$ after $N$ steps (this may not be what you mean by $D_N$ in the question but, precisely, the confusion here might stem from the fact that what you mean by $D_N$ is not quite clear...).

Then, on the one hand, $X_{N+1}=X_N+Z_{N+1}$ where $Z_{N+1}=\pm1$ is centered and independent of $X_N$, hence $X_{N+1}^2=X_N^2+2X_NZ_{N+1}+1$ (using $Z_{N+1}^2=1$) yields $E(X_{N+1}^2)=E(X_N^2)+1$, thus, $E(X_N^2)=N$ for every $N$, that is, the square root of the mean squared distance after $N$ steps is exactly $\sqrt{N}$.
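The identity $E(X_N^2)=N$ can also be confirmed exactly from the binomial distribution of the position; a short Python sketch (the helper name is hypothetical):

```python
from math import comb

def mean_square_position(n):
    # E(X_n^2): the position after n steps is n - 2k with probability C(n, k) / 2^n
    return sum(comb(n, k) * (n - 2 * k) ** 2 for k in range(n + 1)) / 2 ** n

for n in (1, 2, 5, 20, 101):
    assert mean_square_position(n) == n  # E(X_n^2) = n exactly
```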

On the other hand, to express $D_{N+1}=|X_N+Z_{N+1}|$ in terms of $D_N$ is more involved since, for every integer $x\gt0$, $E(D_{N+1}\mid D_N=x)=\frac12(x+1)+\frac12(x-1)=x$, while $E(D_{N+1}\mid D_N=0)=1$. Thus, $$E(D_{N+1})=E(D_N)+P(D_N=0)=E(D_N)+P(X_N=0),$$ which, together with $E(D_1)=1$, leads to $$E(D_{2N+2})=E(D_{2N+1})=1+\sum_{k=1}^NP(X_{2k}=0)=\sum_{k=0}^N\frac1{2^{2k}}{2k\choose k}.$$ Stirling's approximation then yields $$\frac1{2^{2k}}{2k\choose k}\sim\frac1{\sqrt{\pi k}},$$ hence $$E(D_N)\sim\sqrt{\frac2\pi}\cdot\sqrt{N}.$$
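Both the recursion $E(D_{N+1})=E(D_N)+P(X_N=0)$ and the $\sqrt{2/\pi}$ asymptote can be checked numerically; a Python sketch using exact binomial sums (helper names are mine):

```python
from math import comb, sqrt, pi

def expected_distance(n):
    # exact E(D_n) = sum_k C(n, k) |n - 2k| / 2^n
    return sum(comb(n, k) * abs(n - 2 * k) for k in range(n + 1)) / 2 ** n

def p_zero(n):
    # P(X_n = 0): impossible for odd n, C(n, n/2) / 2^n for even n
    return comb(n, n // 2) / 2 ** n if n % 2 == 0 else 0.0

# recursion: E(D_{n+1}) = E(D_n) + P(X_n = 0)
for n in range(1, 40):
    assert abs(expected_distance(n + 1) - expected_distance(n) - p_zero(n)) < 1e-12

# asymptotics: E(D_n) / sqrt(n) -> sqrt(2 / pi) ~ 0.7979
ratio = expected_distance(1000) / sqrt(1000)
assert abs(ratio - sqrt(2 / pi)) < 0.01
```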

To sum up, $\sqrt{E(D_N^2)}$ and $E(D_N)$ both scale as $\sqrt{N}$ when $N\to\infty$, but with different prefactors; the scaling is exact for $\sqrt{E(D_N^2)}$ and only asymptotic for $E(D_N)$.

Finally, the value of the pre-factor for $E(D_N)$ can be understood if one notes that, by the most basic version of the central limit theorem, $D_N/\sqrt{N}$ converges in distribution to $|G|$ where $G$ is standard normal and that $E(|G|)$ is indeed... $\sqrt{2/\pi}$. More generally, for every nonnegative $k$, when $N\to\infty$, $$E(D_N^k)\sim E(|G|^k)\cdot N^{k/2}.$$

$\endgroup$
4
  • $\begingroup$ I don't understand where I am making this assumption $\endgroup$
    – Couchy
    Commented Aug 20, 2014 at 22:44
  • $\begingroup$ Or are you essentially saying that $\sqrt {D_N^2}$ should be equal to $D_{N-1}\pm1$, and not $\sqrt {D_{N-1}^2+1}$? $\endgroup$
    – Couchy
    Commented Aug 20, 2014 at 22:58
  • $\begingroup$ Rereading your post, it seems that you are confusing deterministic and random quantities. See Edit for more detailed explanations. $\endgroup$
    – Did
    Commented Aug 20, 2014 at 23:09
  • $\begingroup$ I don't understand this passage: $E(X_{N+1}^2)=E(X_N^2)+1$, thus $E(X_N^2)=N$. Could you please elaborate? $\endgroup$
    – Tropilio
    Commented Jul 13, 2021 at 14:50
3
$\begingroup$

In an arbitrary dimension $d$:

Let $\vec{R}$ be the end-to-end distance vector of a random walk of fixed step length $|\vec{r}_i| = l$. $\vec{R}$ can then be expressed as $\displaystyle \vec{R} = \sum_{i=1}^N \vec{r}_i$, where $\vec{r}_i$ is the vector of the $i$-th step. The root-mean-square end-to-end distance is given by $\textrm{RMS}=\sqrt { \langle R^2 \rangle }$.

Since the steps are mutually independent, the covariance of two steps $\vec{r}_i$ and $\vec{r}_j$ is zero if $i\neq j$, and $\textrm{Cov}(\vec{r}_i, \ \vec{r}_j)= \textrm{Var}(\vec{r}_i)$ if $i=j$. The variance of $\vec{r}_i$ can be expressed as $\textrm{Var}(\vec{r}_i)= \langle \vec{r}_i \cdot \vec{r}_i \rangle - \langle \vec{r}_i \rangle^2$. Due to symmetry $\langle \vec{r}_i \rangle=\vec{0}$, and therefore the variance of $\vec{r}_i$ is simply $\textrm{Var}(\vec{r}_i)= \langle \vec{r}_i \cdot \vec{r}_i \rangle = |\vec{r}_i|^2 = l^2$. Altogether, the covariance of $\vec{r}_i$ and $\vec{r}_j$ equals $\textrm{Cov}(\vec{r}_i, \ \vec{r}_j)=\delta_{ij}l^2$. The covariance can also be expressed as $\textrm{Cov}(\vec{r}_i, \ \vec{r}_j) = \langle \vec{r}_i \cdot \vec{r}_j \rangle - \langle \vec{r}_i \rangle \cdot \langle \vec{r}_j \rangle$; combining the two expressions and using $\langle \vec{r}_i \rangle=\vec{0}$ gives $\langle \vec{r}_i \cdot \vec{r}_j \rangle =\delta_{ij}l^2$. This result can be used to determine the RMS:

$$\textrm{RMS}=\sqrt { \langle R^2 \rangle } = \sqrt { \langle \vec{R} \cdot \vec{R} \rangle } =\sqrt { \Big\langle \sum_{i=1}^N \vec{r}_i \cdot \sum_{j=1}^N \vec{r}_j \Big\rangle } =\sqrt { \sum_{i=1}^N \sum_{j=1}^N \langle \vec{r}_i \cdot \vec{r}_j \rangle }= $$ $$=\sqrt { \sum_{i=1}^N \sum_{j=1}^N l^2 \delta_{ij}}=\sqrt { \sum_{i=1}^N l^2}=\sqrt { N l^2}=l\sqrt { N }$$

Let $Z_i$ denote the $i$-th coordinate of the end-to-end distance vector $\vec{R}$ after $N$ steps, and let $X_i$ and $Y_i$ denote the number of steps taken in the $i$-th dimension in the positive and negative direction respectively. Then the set of random variables $\{X_i, Y_i\}_{i=1}^d$ follows a multinomial distribution with parameters $N$ and $\displaystyle p_i=\frac{1}{2d}$. For sufficiently large values of $N$, $\{X_i, Y_i\}_{i=1}^d$ are approximately iid (independent and identically distributed) Poisson random variables with parameters $\displaystyle \lambda_i = \frac{N}{2d}$. For $\lambda > 20$, i.e. $N>40d$, $\textrm{Po}(\lambda) \approx \textrm{N}(\lambda, \lambda)$. Since $Z_i = l(X_i - Y_i)$, it follows that $\displaystyle Z_i \sim \textrm{N}(l(\lambda - \lambda), l^2(\lambda+\lambda))=\textrm{N}(0, 2l^2\lambda)=\textrm{N}\left(0, \frac{l^2N}{d}\right)$.

$\displaystyle \langle R \rangle = \langle \sqrt{R^2} \rangle = \left\langle \sqrt{ \sum_{i=1}^d Z_i^2} \right\rangle$. The square root of the sum of the squares of $k$ independent $\textrm{N}(0, 1)$-distributed random variables follows the chi distribution, $\chi_k$. Therefore $\displaystyle \sqrt{ \sum_{i=1}^d \frac{dZ_i^2}{l^2N}}$ is approximately $\chi_d$-distributed for large values of $N$. The expected value of a $\chi_k$-distributed random variable is $\displaystyle \sqrt{2} \frac{ \Gamma \left(\frac{k+1}{2}\right) }{\Gamma \left( \frac{k}{2}\right)}$.

Hence $\displaystyle \langle R \rangle =\left\langle\sqrt{ \sum_{i=1}^d Z_i^2}\right\rangle =\left\langle l \sqrt{\frac{N}{d}} \sqrt{ \sum_{i=1}^d \frac{dZ_i^2}{l^2N} }\right\rangle = l \sqrt{\frac{2N}{d} }\frac{ \Gamma \left(\frac{d+1}{2}\right) }{\Gamma \left( \frac{d}{2}\right)}$.
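Both results — $\textrm{RMS}=l\sqrt N$ and the $\Gamma$-ratio formula for $\langle R\rangle$ — can be sanity-checked by Monte Carlo. A Python sketch (helper names are mine) for the lattice walk with $l=1$, choosing one of the $2d$ axis directions uniformly at each step:

```python
import random
from math import sqrt, gamma

def walk_stats(d, n, trials=8000, seed=1):
    # Monte Carlo estimates of <R> and <R^2> for an n-step lattice walk in d dimensions
    rng = random.Random(seed)
    sum_r, sum_r2 = 0.0, 0.0
    for _ in range(trials):
        pos = [0] * d
        for _ in range(n):
            axis = rng.randrange(d)           # pick a dimension uniformly
            pos[axis] += rng.choice((-1, 1))  # step +1 or -1 along it
        r2 = sum(x * x for x in pos)
        sum_r += sqrt(r2)
        sum_r2 += r2
    return sum_r / trials, sum_r2 / trials

def predicted_mean_r(d, n):
    # <R> ~ sqrt(2n/d) * Gamma((d+1)/2) / Gamma(d/2), valid for large n (l = 1)
    return sqrt(2 * n / d) * gamma((d + 1) / 2) / gamma(d / 2)

for d in (1, 2, 3):
    mean_r, mean_r2 = walk_stats(d, 200)
    assert abs(mean_r2 - 200) / 200 < 0.06  # <R^2> = n l^2 holds in any dimension
    assert abs(mean_r - predicted_mean_r(d, 200)) / predicted_mean_r(d, 200) < 0.06
```

For $d=1$ the prefactor reduces to $\Gamma(1)/\Gamma(1/2)=1/\sqrt\pi$, recovering the $\sqrt{2N/\pi}$ of the first answer.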

$\endgroup$
