Average distance from origin in a random walk on the integer number line

Question

In a random integer walk along a number line (each step 0.5 probability of moving right and 0.5 probability of moving left), what is the average distance from the origin during the walk?

Other questions on here ask something that sounds like this, but I'm not sure if they are talking about the average during the walk, or just your expected distance at the end of the walk.

Follow-up question: if you have the answer for the end of a walk of length $n$ (call it $w(n)$), could you get the during walk average by doing the average $\sum_1^n{w(i)}/n$, or would that not work because the distance at each step is not independent of the distance of the previous steps?

user217285 · Accepted Answer · 2015-12-26 05:18:51Z

Assume without loss of generality that the random walk starts at 0. Let $X_i$ be a random variable with $P(X_i = 1) = P(X_i = -1) = 1/2$, so that the random walk is represented by the random variable $S_n = X_1 + \cdots + X_n$. At time $m$, the distance from the origin is $|S_m|$. Let us compute $E[|S_{2m}|] = E[|X_1 + \cdots + X_{2m}|]$.

The random walk can only be at an even position, so $P(|S_{2m}| = j) = 0$ for $j$ odd. Now let $2k$ be an even integer in $[0,2m]$. By symmetry, $$P(X_1 + \cdots + X_{2m} = 2k) = P(X_1 + \cdots X_{2m} = -2k)$$ Let us compute the probability that the walker is at $2k$. If the walker moves $j$ steps forward and $2m - j$ steps backwards, he is at $2j - 2m = 2k$. Thus we require $k+m$ many $+1$'s. The sum of the $X_i$ is a binomial distribution with parameters $p = 1/2$ and $n = 2m$, so that $$P(S_{2m} = 2k) = \binom{2m}{m+k} \left(\frac12\right)^{2m}$$ and $$P(|S_{2m}| = 2k) = \binom{2m}{m+k} \left(\frac12\right)^{2m-1}.$$ Computing the expectation, \begin{align*} E[|S_{2m}|] &= \sum_{k=0}^{2m} k P(|S_{2m}| = k) \\ &= \sum_{k=0}^{m} 2k P(|S_{2m}| = 2k) \\ &= \sum_{k=0}^{m} k \left(\frac12\right)^{2m-2}\binom{2m}{k+m} \\ &= \frac{m+1}{2^{2m-1}}\binom{2 m}{m+1} \\ &= \frac{m+1}{2^{2m-1}} \cdot \frac{(2m)!}{(m+1)!(m-1)!} \\ &= \frac{m+1}{2^{2m-1}} \cdot \frac{(2m)!}{m!m!} \cdot \frac{m}{m+1} \\ &= \frac{2m}{2^{2m}} \binom{2m}{m}. \end{align*} Similarly, one can show that $$E[|S_{2m+1}|] = \frac{2m+1}{2^{2m}} \binom{2m}{m}.$$ Now we can compute the average distance. \begin{align*} E\left[\frac{|S_1| + \cdots + |S_{2n}|}{2n} \right] &= \frac{1}{2n}(E[|S_1|] + \cdots + E[|S_{2n}|]) \\ &= \frac{1}{2n}\left(\sum_{k=0}^{n-1} E[|S_{2k+1}|] + \sum_{k=1}^n E[|S_{2k}|]\right) \end{align*}

But isn't $E[|S_{2m}|]$ the expected distance at the end of the walk, and not during the walk? — tscizzle, Commented Dec 21, 2015 at 0:23
@tscizzle do you mind writing the random variable you're interested in? The wording of the question is unclear — user217285, Commented Dec 23, 2015 at 23:27
I think it would be $\frac{(S_1+...+S_n)}{n}$, using your definition of $S$, since $S$ is the distance from the origin, and I want the expected average distance from the origin of all the positions along the walk. — tscizzle, Commented Dec 24, 2015 at 0:05
@tscizzle $S_i$ is the position of the walk; the distance is $|S_i|$. For example, the random walk could be at position -5, and the distance would be 5. Unless you actually want to take into account it being at a negative location... — user217285, Commented Dec 24, 2015 at 1:15
you're 100% right, my bad. there should be absolute-value in there, as you say. — tscizzle, Commented Dec 24, 2015 at 5:09

Filip · Accepted Answer · 2019-12-10 15:21:10Z

In an arbitrary dimension d:

The average displacement from the origin during the walk depends on whether or not the displacement at the end of the walk is known. If the displacement at the end is not known, than the average displacement from the origin during the walk is given by: $$\frac{1}{N+1} \sum_{n=0}^N \langle R_i \rangle =\frac{l\sqrt{2}}{d(N+1)}\frac{ \Gamma \left(\frac{d+1}{2}\right) }{\Gamma \left( \frac{d}{2}\right)} \sum_{n=1}^N \sqrt{n} $$

Let $\vec{R}$ be the end-to-end distance vector of a random walk of fixed step length $|\vec{r}_i| = l$. $\vec{R}$ can then be expressed as $\displaystyle \vec{R} = \sum_{i=1}^N \vec{r}_i$, where $\vec{r}_i$ is the vector of the $i$-th step. The Root-Mean-Square End-to-End Distance is given by $\textrm{RMS}=\sqrt { \langle R^2 \rangle }$. Since the steps are mutually independent, the covariance of two steps $\vec{r}_i$ and $\vec{r}_j$ is zero if $i\neq j$ and $\textrm{Cov}(\vec{r}_i, \ \vec{r}_j)= \textrm{Var}(\vec{r}_i)$ if $i=j$. The variance of $ \vec{r}_i$ can be expressed as $ \textrm{Var}(\vec{r}_i)= \langle \vec{r}_i \cdot \vec{r}_i \rangle - \langle \vec{r}_i \rangle^2$. Due to symmetry $\langle \vec{r}_i \rangle=\vec{0}$ and therefore the variance of of $ \vec{r}_i$ is simply $ \textrm{Var}(\vec{r}_i)= \langle \vec{r}_i \cdot \vec{r}_i \rangle = |\vec{r}_i|^2 = l^2$. Altogether, the covariance of $\vec{r}_i$ and $\vec{r}_j$ equals $\textrm{Cov}(\vec{r}_i, \ \vec{r}_j)=\delta_{ij}l^2$. The covariance of $\vec{r}_i$ and $\vec{r}_j$ can also be expressed as $\textrm{Cov}(\vec{r}_i, \ \vec{r}_j) = \langle \vec{r}_i \cdot \vec{r}_j \rangle - \langle \vec{r}_i \rangle \cdot \langle \vec{r}_j \rangle$. Combining the two different expressions for the covariance and using that $\langle \vec{r}_i \rangle=0$, results in $\langle \vec{r}_i \cdot \vec{r}_j \rangle =\delta_{ij}l^2$. This result can be used to determine the RMS:

$$\textrm{RMS}=\sqrt { \langle R^2 \rangle } = \sqrt { \langle \vec{R} \cdot \vec{R} \rangle } =\sqrt { \big\langle \sum_{i=1}^N \vec{r}_i \cdot \sum_{j=1}^N \vec{r}_j \big\rangle } =\sqrt { \sum_{i=1}^N \sum_{j=1}^N \langle \vec{r}_i \cdot \vec{r}_j \rangle } =\sqrt { \sum_{i=1}^N \sum_{j=1}^N l^2 \delta_{ij} + 0^2}= \sqrt { \sum_{i=1}^N l^2}=\sqrt { N l^2}=l\sqrt { N }$$

Let $Z_i$ denote the $i$-th coordinate of the end-to-end distance vector $\vec{R}$ after $N$ steps, and let $X_i$ and $Y_i$ denote the number of steps taken in the $i$-th dimension in the positive and negative direction respectively. Then the set of random variables $\{X_i, Y_i\}_{i=1}^d$ follows a multinomial distribution with parameters $N$ and $\displaystyle p_i=\frac{N}{2d}$. For sufficiently large values of $N$, $\{X_i, Y_i\}_{i=1}^d$ are approximately iid (independent and identically distributed) Poisson random variables with parameters $\displaystyle \lambda_i = \frac{N}{2d}$. For $\lambda > 20$, i.e. $N>40d$, $\textrm{Po}(\lambda) \sim \textrm{N}(\lambda, \lambda)$. $ Z_i = l(X_i - Y_i)$ and therefore $\displaystyle Z_i \sim \textrm{N}(l(\lambda - \lambda), l^2(\lambda+\lambda))=\textrm{N}(0, 2l\lambda)=\textrm{N}\left(0, \frac{l^2N}{d}\right)$.

$\displaystyle \langle R \rangle = \langle \sqrt{R^2} \rangle = \left\langle \sqrt{ \sum_{i=1}^d Z_i^2} \right\rangle$. The square root of a sum of $k$ independent $\textrm{N}(0, 1)$-distributed random variables is distributed according to the chi distribution, $\chi_k$. Therefore $\displaystyle \sqrt{ \sum_{i=1}^d \frac{dZ_i^2}{l^2N}}$ is approximately $\chi_d$-distributed for large values of $N$. The expected value of a $\chi_k$-distributed random variable is $\displaystyle \sqrt{2} \frac{ \Gamma \left(\frac{k+1}{2}\right) }{\Gamma \left( \frac{k}{2}\right)}$.

Hence $\displaystyle \langle R \rangle =\left\langle\sqrt{ \sum_{i=1}^d Z_i^2}\right\rangle =\left\langle l \sqrt{\frac{N}{d}} \sqrt{ \sum_{i=1}^d \frac{dZ_i^2}{l^2N} }\right\rangle = l \sqrt{\frac{2N}{d} }\frac{ \Gamma \left(\frac{d+1}{2}\right) }{\Gamma \left( \frac{d}{2}\right)}$. \

If the displacement at the end is not known, than the average displacement from the origin during the walk is given by:

$$ \frac{1}{N+1} \sum_{n=0}^N \langle R_i \rangle =\frac{1}{N+1} \sum_{n=0}^N l \sqrt{\frac{2n}{d} }\frac{ \Gamma \left(\frac{d+1}{2}\right) }{\Gamma \left( \frac{d}{2}\right)} = \frac{l\sqrt{2}}{d(N+1)}\frac{ \Gamma \left(\frac{d+1}{2}\right) }{\Gamma \left( \frac{d}{2}\right)} \sum_{n=1}^N \sqrt{n} $$

Stack Exchange Network

Average distance from origin in a random walk on the integer number line

2 Answers 2

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged
integers
random-walk
multinomial-distribution
.

Linked

Hot Network Questions

Average distance from origin in a random walk on the integer number line

2 Answers 2

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged integersrandom-walkmultinomial-distribution.

Linked

Related

Hot Network Questions

Not the answer you're looking for? Browse other questions tagged
integers
random-walk
multinomial-distribution
.