
I'm currently studying stochastic processes for the first time in the context of physics (Langevin dynamics), and I've come across a few conceptual difficulties regarding the Fokker-Planck equation which I want to clear up. The general form I'm looking at is:

$$\frac{\partial p}{\partial t} = - \frac{\partial }{\partial x} \left( \mu(x,t) \ p \right) + \frac{\partial^2 }{\partial x^2} \left( D(x,t) \ p \right).$$

The boundary condition is absorption at infinity: the probability current $$j(x,t) \equiv \mu (x,t) p - \frac{\partial }{\partial x} \left( D(x,t) \ p \right) \to 0, \quad |x| \to \infty \quad \forall t.$$

(This isn't really relevant to my questions, though.)

First of all, what is the function $p$ in the equation? In the notes I'm using, it's the conditional pdf: $p = p(x,t|x_0,t_0)$ (this is actually how the equation is derived). Elsewhere though (e.g. Wikipedia), it stands for the regular pdf: $p = p(x,t)$.

Question 1. Which is it? I am tempted to say it doesn't matter, because by the total probability rule, for any fixed earlier time $t'$,

$$ p(x,t) = \int dx'\ p(x,t|x',t')\,p(x',t'),$$

so I can multiply the PDE through by $p(x',t')$ and integrate over $x'$, transforming the conditional pdf into the regular pdf. If it's the regular pdf, then I can pull the $p(x',t')$ and the integral in front and argue that the equation is true for any $p(x',t')$, thus transforming the regular pdf into the conditional pdf. Is this argument correct?

Of course what will change are the initial conditions, which brings me to my next point.

My notes only consider processes that are time translation invariant, so $\mu$ and $D$ have no time dependence, and $p(x,t|x_0,t_0) = p(x,t-t_0|x_0,0) \equiv p(x,t-t_0|x_0)$. Then the initial condition (for the conditional pdf) is:

$$p(x,0|x_0) = \delta(x-x_0).$$

After solving this, I know $p(x,t|x_0)$, so by the total probability rule I can find $p(x,t)$ by integrating over all values of $x_0$, provided I know their initial distribution $g(x_0)$:

$$p(x,t) = \int dx_0 \ p(x,t|x_0) g(x_0)$$
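For concreteness, here is a small numerical sketch of this superposition in the simplest case I can think of (constant $D$, $\mu = 0$, so the conditional pdf is the Gaussian heat kernel; the function names and the choice of $g$ are just for illustration):

```python
import numpy as np

D = 1.0  # constant diffusion coefficient (simplest case: Brownian motion)

def kernel(x, t, x0):
    # Conditional pdf p(x,t|x0): Gaussian heat kernel of dp/dt = D d^2p/dx^2
    return np.exp(-(x - x0)**2 / (4 * D * t)) / np.sqrt(4 * np.pi * D * t)

def g(x0):
    # Illustrative initial distribution: unit-variance Gaussian centered at -1
    return np.exp(-(x0 + 1)**2 / 2) / np.sqrt(2 * np.pi)

# p(x,t) = \int dx0  p(x,t|x0) g(x0), evaluated as a Riemann sum
x0 = np.linspace(-12, 12, 4001)
dx0 = x0[1] - x0[0]
x, t = 0.5, 2.0
p_xt = np.sum(kernel(x, t, x0) * g(x0)) * dx0

# Gaussians convolve in closed form: the result is N(-1, 1 + 2*D*t)
var = 1 + 2 * D * t
print(p_xt, np.exp(-(x + 1)**2 / (2 * var)) / np.sqrt(2 * np.pi * var))
```

The two printed numbers agree, which is just the superposition written out numerically.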

On the other hand, if the PDE is for the regular pdf, the initial condition is:

$$p(x,0) = g(x).$$

Solving this PDE of course gives $p(x,t)$ directly.

Question 2. How do I extend this to non-stationary processes? For the second case (regular pdf) everything stays the same, I think. But in the first case (conditional pdf), which initial condition do I want?

$$p(x,0|x_0,t_0) = \ ?$$

Also, how do I retrieve $p(x,t)$ after solving for $p(x,t|x_0,t_0)$ ? I have the initial distribution $g(x_0)$, but I can't use the total probability theorem as before. It seems like there's one time variable missing in all this? Should my new $g$ be a function of two variables?

Question 3. The conditional pdf for the stationary process, $p(x,t|x_0)$, appears to have the interpretation of a Green's function (integrating it against the initial condition $g(x)$ yields the sought total pdf). But the Fokker-Planck equation isn't of the form $$L p(x,t) = g(x),$$ with some linear differential operator $L$. In fact, the Fokker-Planck equation is homogeneous. So, for which operator (PDE), if any, is $p(x,t|x_0)$ the Green's function? I think maybe I'm misunderstanding Green's functions...

EDIT: I put up a bounty because I'm looking for an answer which would specifically address the three questions posed in detail.

  • I had a similar question and I have not received an answer. The Fokker-Planck (forward Kolmogorov) equation is typically solved with a Fourier transform, and the delta function makes it easy to invert the solution back. However, I always wondered why we can't solve it by setting $p(x,0)=g(x)$. In my opinion, if I am modeling a system of 10 men, to say that all of them start at $X(0)=x$ does not make sense. However, for other systems this is perfectly fine. For example, a bird population at $t_0$ has a fixed number of members; hence, $X(0)=x$.
    – Edv Beq
    Commented May 25, 2017 at 1:34
  • @EdvBeq It depends on your system. $p(x,0)$ means that your system is stationary, i.e. your probability density is independent of time, so the moments of your random variable won't change with time. This is not always true. Consider the fluctuation of a particle in a potential well: the fluctuation causes the moments to change in time. But on a large time scale, the system relaxes and the particle will rest at a stable state. In that case, the large-time-scale dynamics is described by a stationary process as the particle rests. But prior to that your system is not stationary. Commented May 26, 2017 at 8:54

2 Answers


In the Fokker-Planck equation, the unknown function (called here $p$) is a spatial probability density function at a given time $t$. We can write the Fokker-Planck equation as follows: $$\left\{\begin{array}{ll}\frac{\partial p}{\partial t}+\frac{\partial}{\partial x}\big(\mu(x,t)p\big) -\frac{\partial^2}{\partial x^2}\left(D(x,t)p\right)=0\\ p(x,t_0)=f(x).\end{array}\right.$$ This governs the evolution of the probability density function from the initial condition $f(x)$. An important property of the Fokker-Planck equation is mass conservation: the quantity $\int_{\mathbb R}p(x,t)\,\mathrm dx$ is independent of $t$ and is equal to $\int_{\mathbb R}f(x)\,\mathrm dx$. If $f(x)=\delta(x-x_0)$, we have $\int_{\mathbb R}p(x,t)\,\mathrm dx=1$ for all time $t$.

As the equation is linear, we deduce that if we call $p(x,t|x_0,t_0)$ the solution of the Fokker-Planck equation with $f(x)=\delta(x-x_0)$, then the solution $p(x,t|f,t_0)$ to the Fokker-Planck equation starting at $t_0$ with the probability density $f$ is $$p(x,t|f,t_0)=\int_{\mathbb R}p(x,t|x_0,t_0)f(x_0)\,\mathrm dx_0.$$ Note that this is another form of the "total probability rule" (also called the Chapman-Kolmogorov relation), because $f(x)=p(x,t_0|f,t_0)$. We have just shown that $p(x,t|x_0,t_0)$ is the Green's function of the operator $$\frac{\partial}{\partial t}+\frac\partial{\partial x}\Big(\mu(x,t)\cdot\Big)-\frac{\partial^2}{\partial x^2}\Big(D(x,t)\cdot\Big).$$
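Both properties are easy to see numerically. Here is a minimal sketch with an explicit finite-difference scheme, assuming constant $\mu$ and $D$ and using a narrow Gaussian as a stand-in for the delta function (the scheme and all names are mine, not from any particular library):

```python
import numpy as np

# dp/dt = -d/dx(mu p) + d^2/dx^2(D p) with constant coefficients, solved by
# an explicit centered finite-difference scheme on a wide grid (p ~ 0 at edges).
mu, D = 0.5, 1.0
x = np.linspace(-20, 20, 801)
dx = x[1] - x[0]
dt = 0.1 * dx**2 / D  # small time step for stability of the explicit scheme

def step(p):
    j = mu * p - D * np.gradient(p, dx)  # probability current j = mu p - d(D p)/dx
    return p - dt * np.gradient(j, dx)   # continuity equation dp/dt = -dj/dx

def solve(p0, t_final):
    p = p0.copy()
    for _ in range(int(t_final / dt)):
        p = step(p)
    return p

def near_delta(x0, eps=0.1):  # narrow Gaussian standing in for delta(x - x0)
    return np.exp(-(x - x0)**2 / eps) / np.sqrt(np.pi * eps)

# Linearity: evolving a mixture f equals the same mixture of evolved deltas,
# i.e. p(x,t|x0,t0) acts as a Green's function.
f = 0.5 * near_delta(-2.0) + 0.5 * near_delta(1.0)
lhs = solve(f, 1.0)
rhs = 0.5 * solve(near_delta(-2.0), 1.0) + 0.5 * solve(near_delta(1.0), 1.0)
print(np.max(np.abs(lhs - rhs)))  # ~1e-16: superposition holds
print(np.sum(lhs) * dx)           # ~1: mass conservation
```

The superposition check holds to rounding error because the discrete update is itself linear; mass conservation holds because the scheme is in flux (divergence) form and the current vanishes at the far boundaries, mirroring the absorbing condition in the question.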

In the case of time-dependent coefficients $\mu$ and $D$, the question becomes very much dependent on the actual expressions of $\mu$ and $D$. For instance, if $D(x,t)=\mathscr Dt$ and $\mu=0$, the solution is obtained exactly from the usual Green's function as $$p(x,t|0,0)=\frac{1}{\sqrt{2\pi\mathscr Dt^2}}\exp\left(-\frac{x^2}{2\mathscr D t^2}\right).$$ But this is an exceptional situation; there are usually no exact solutions. If the time variations of $\mu$ and $D$ are bounded, a possible approach is a multiple-scale expansion, which consists of introducing several slow time scales and a perturbation parameter. This is more robust than the standard perturbative expansion, especially for time-dependent problems. Many other techniques exist, such as the method of matched asymptotic expansions. Solving time-dependent partial differential equations is a difficult problem in general.
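One cheap way to check this particular closed form, assuming the usual Itô correspondence for this FPE convention (with $\mu=0$, $\mathrm dX = \sqrt{2D(t)}\,\mathrm dW$): simulate the SDE and compare the sample variance with $\mathscr Dt^2$. A Monte-Carlo sketch, with arbitrary parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
Dscr, T = 0.7, 2.0              # the constant script-D in D(x,t) = Dscr * t
n_steps, n_paths = 1000, 100_000
dt = T / n_steps

# Euler-Maruyama for dX = sqrt(2 * Dscr * t) dW, whose density solves
# dp/dt = d^2/dx^2 (Dscr * t * p) with X(0) = 0.
X = np.zeros(n_paths)
for k in range(n_steps):
    t = k * dt
    X += np.sqrt(2 * Dscr * t * dt) * rng.standard_normal(n_paths)

print(X.var())      # ~2.8, up to Monte-Carlo and discretization error
print(Dscr * T**2)  # = 2.8, the variance of the closed-form Gaussian
```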


You may or may not be familiar with the result that a random walk, even a one-dimensional one with equal probability for both kinds of steps, has a physically rather odd behavior regarding the expected walking distance.

Imagine a drunk guy on a street with large tiles, who wanders around without self-control. Let's fix a time and a distance scale: say at each second he makes one step over a tile, either to the left or to the right, with equal chance for either side. At second $t=0$, you have

$p(x=0, t=0)=1=1\cdot 2^{0}$

and zero chance for any other position. At second $t=1$, he moved away to the left or to the right, so

$p(x=0, t=1)=0$

and

$p(x=\pm 1, t=1)=\frac{1}{2}=1\cdot 2^{-1}$

At $t=2$, there's a chance of $\left(\frac{1}{2}\right)^{2}$ that he's landed on the outermost possible position, so

$p(x=\pm 2, t=2)=\frac{1}{4}=1\cdot 2^{-2}$

while there are two paths he could have taken to end up in the middle again:

$p(x=0, t=2)=2\cdot 2^{-2} = \frac{1}{2}$

You can compute the chances from the diagram below.

[figure: tree diagram of the random-walk probabilities for the first few steps]

Here it's clear that the outermost position will have the exponentially falling chance of $2^{-n}$, because it would mean you get e.g. "he went left" $n$ times in a row. On the other hand, there are more and more paths that turn around and come back to the center.

Importantly, the distribution of probabilities is one that spreads out.

Surely the expected position after any number of (even) time steps, the average over all walks, is $x(t=n)=0$; let's write ${\rm E}[X_n]=0$.

But the movement is anything but linear, and the expected distance (unsigned, disregarding whether he effectively moved left or right) turns out to go as $n^\tfrac{1}{2}$, i.e. "$|x(t=n)|=\sqrt{n}$". Or, to capture this more formally, ${\rm E}[X_t^2]=t$, or, for the next paragraph, ${\rm E}[X_t^2]^\tfrac{1}{2}=t^\tfrac{1}{2}$.
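These moments are easy to reproduce by brute force; a quick sketch of the drunk walk (the numbers are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n, walkers = 400, 20_000

# each second: one step of +1 or -1 with equal probability
steps = rng.choice([-1, 1], size=(walkers, n))
X_n = steps.sum(axis=1)  # final positions after n seconds

print(X_n.mean())    # ~0:  E[X_n] = 0
print(X_n.var(), n)  # ~n:  E[X_n^2] = n
# the typical unsigned distance grows like sqrt(n) (sqrt(2n/pi) for large n)
print(np.abs(X_n).mean(), np.sqrt(2 * n / np.pi))
```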

I said this is physically odd because you can't represent this typical (unsigned) path by an ODE, like you'd do in mechanics 101. The function $x(t)=\sqrt{t}$ has no velocity

$v(t=0):=\lim_{h\to 0^+}\dfrac{x(0+h)-x(0)}{h}$

at the origin. This is clear from how steep the graph of $\sqrt{t}$ is there.

The lesson is that you want to use a whole pdf to describe e.g. a particle's motion in situations where the random-walk model applies, e.g. in gases where the random steps come from unpredictable pushes of other particles from all sides.

This all might seem like a long precursor, but the takeaway I want you to get from it is that an unevenness in powers arose here: $${\rm E}[X_t^2]=t.$$

You have a length unit to the power of 2 on the left, and a time unit to the power of 1 on the right. This is the pattern that follows through everything related to the Brownian process, which provides the distribution for this situation (the information that I captured for the first $n=5$ steps in the discrete case above). The probability density w.r.t. $x$ looks like $$p_D(x,t) = \dfrac{1}{\sqrt{4\pi}}\left(\dfrac{1}{D\,t}\right)^\frac{1}{2}\exp\left({-\dfrac{1}{D\,t}\left(\dfrac{x}{2}\right)^2}\right)$$

That density function is such that for small $t$ it's very sharp and high, because of the $\frac{1}{\sqrt{t}}$, and then as $t$ grows it falls and spreads.

[figure: the density $p_D(x,t)$ plotted at three increasing times; the peak falls and the density spreads]

Here $D$ regulates how the time and length scales interact. You may choose a different time scale and thus renormalize $D$ to 1.

At this point you want to do a unit analysis. Since this is a density w.r.t. $x$, you find that $\left(\dfrac{1}{D\,t}\right)^\frac{1}{2}$ must have units of one over length, and thus $D$ is length squared per time, i.e. it has the same units as $x^2/t$. You may also arrive at that conclusion from $\dfrac{1}{D\,t}\left(\dfrac{x}{2}\right)^2$, by the observation that the argument of a power series like $\exp$ must be unitless. It may be worth mentioning that now $v_D(t):=\left(\dfrac{D}{t}\right)^\frac{1}{2}$ makes for a velocity.

Now take a look at the differential equation this exp-function is the solution of, the diffusion equation $$\dfrac{\partial}{\partial{}t} p_D(x,t) = D \dfrac{\partial^2}{\partial{}x^2} p_D(x,t)$$
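You can let a computer algebra system confirm that the density above solves this equation; a quick check (using SymPy merely as one convenient tool):

```python
import sympy as sp

x, t, D = sp.symbols('x t D', positive=True)
# p_D(x,t) from above, rewritten as exp(-x^2/(4 D t)) / sqrt(4 pi D t)
p = sp.exp(-x**2 / (4 * D * t)) / sp.sqrt(4 * sp.pi * D * t)

# dp/dt - D * d^2p/dx^2 should be identically zero
print(sp.simplify(sp.diff(p, t) - D * sp.diff(p, x, 2)))  # prints 0
```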

Here again, check the units. For this to check out, $D$ must share units of $x^2/t$.

Let's take a step back and try to understand why this describes diffusion in the first place. Consider the function $h(x):=7 x^2$. We have

$\dfrac{\partial^2}{\partial{}x^2} h(x) = 7\cdot 2>0$

Consider the function $k(x):=-5 x^2$. We have

$\dfrac{\partial^2}{\partial{}x^2} k(x) = -5\cdot 2<0$

The equation says that the gain of density at a position $x=\zeta$, i.e. the rate $\dfrac{\partial}{\partial{}t} p_D(x=\zeta, t)$, equals (up to the factor $D$) the curvature of the function at that same position $x=\zeta$. Now look at the plot with the three $\exp$-functions posted above. Wherever the function (or any function) is concave like $-x^2$, the differential equation will steal from there, and wherever it is convex like $+x^2$, the differential equation will reward there. This is also why the inflection points hardly move. The differential equation describes a diffusion of value from concave regions to convex ones. And the differential operator is linear, meaning that if you overlap two exponentials with different center points, say, the two resulting concave peaks will be punished just the same.
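To make the steal/reward reading concrete, here is a tiny numerical check on a single Gaussian bump, using a centered second difference (the names are mine):

```python
import numpy as np

x = np.linspace(-4, 4, 401)
dx = x[1] - x[0]
p = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)  # one Gaussian bump

# centered second difference ~ d^2 p/dx^2 (np.roll wraps at the edges,
# but p is essentially zero there)
curv = (np.roll(p, -1) - 2 * p + np.roll(p, 1)) / dx**2
dpdt = 1.0 * curv  # right-hand side of the diffusion equation with D = 1

print(dpdt[np.argmax(p)] < 0)                # True: the concave peak loses density
print(dpdt[np.argmin(np.abs(x - 2.5))] > 0)  # True: the convex tail gains density
```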

The function describes the drunken guy's motion in this sense. You may say at $t_0$ a person is at $x_0$ (the $p(x,t=t_0|x_0,t_0)=\delta(x-x_0)$ initial condition you mention), and then $p(x,t|x_0,t_0)$ will in this context give you, at time $t$, the $x$-distribution as it evolved from $x_0$ after the time $\Delta t = t-t_0$ has passed. The classical Green's function application would be when you have an electrical charge that acts as a source for an electric field, ${\rm div}\, E=\delta$. Here $p$ acts like a Green's function in that, for certainty of position $x_0$ at $t_0$, it tells you how the knowledge diffuses; and if you've got 10 independent drunken men you can't distinguish, then you have 10 densities that eventually merge together, leaving you with one more or less flat blob of "I don't know where they are now". The bulk here just always spreads out and away from its own peaks, not from an external source.

Your $g$ would be an initial distribution (it doesn't need to be 10 sharp peaks); it naturally depends on $x$ and is given for a particular time $t_0$.

Yes, as far as the context in your question goes, it doesn't matter if it's a conditional probability that formally depends on $x',t'$. The function could also depend on your mom's bank account. The differential equality is one in $x$ and $t$, and the normalization is w.r.t. $x$. And if the initial condition is one with a delta at $x_0$, then the solution will carry the $x_0$ anyway. Keeping track of the $x_0$ is relevant when you do e.g. path integrals in quantum mechanics (the Schrödinger equation also has this form, but with imaginary $D$), or Kalman filters / recursive Bayesian estimation in sensor fusion (i.e. whenever you do anything that looks like a smooth version of Bayes' theorem).

And an equation with non-constant $D(x,t)$ just means the peak penalty is determined locally, just like smoking weed is penalized differently in different countries. The $\mu$ (check the units) induces a drift of the center in time, i.e. a deterministic pull if you consider the underlying stochastic process.
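A last sketch to see that drift, assuming constant $\mu$ and $D$ and the usual Itô form $\mathrm dX = \mu\,\mathrm dt + \sqrt{2D}\,\mathrm dW$ for this FPE convention (parameters arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
mu, D, T = 0.8, 0.5, 3.0
n_steps, n_paths = 1000, 50_000
dt = T / n_steps

# Euler-Maruyama: drift moves the center, noise spreads the density
X = np.zeros(n_paths)
for _ in range(n_steps):
    X += mu * dt + np.sqrt(2 * D * dt) * rng.standard_normal(n_paths)

print(X.mean(), mu * T)    # ~2.4: the center drifts at rate mu
print(X.var(), 2 * D * T)  # ~3.0: the spread still grows diffusively
```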

  • Thanks very much for the answer. Could you also specifically address the points in questions 2 & 3? Especially the treatment of a non-stationary process (full generality) and setting up an initial condition; would it depend on an extra time variable? Also, Brownian motion doesn't admit a stationary distribution, yet it seems that the solution (pdf) is still a function of $t-t_0$. Why? Commented May 25, 2017 at 9:48
  • @SpineFeast: I'm not sure about your definitions. For the case with $D$ constant, you have the explicit solution, time dependent in and of itself, and you can easily compute the variance and study its behavior. This function always describes the flowing away of information, if you will. In general, there's also the buzzword "Feynman–Kac formula". A Brownian motion, characterized by the Wiener process, has an explicit condition for time differences, see here. Formalize your "non-stationary" condition and maybe then it's clear what's up.
    – Nikolaj-K
    Commented May 25, 2017 at 12:28
  • By stationary process I mean one where there exists a stationary distribution, i.e. a nontrivial solution to the FPE with the LHS equal to 0. Then, in the limit $t \to \infty$, any initial distribution will tend to that stationary distribution. For a stationary process, it's always the case that $p(x,t|x_0,t_0) = p(x,t-t_0|x_0,0)$. But this is also true for Brownian motion, which isn't stationary. But I suppose that's just because I have $\mu$ and $D$ independent of time. Commented May 25, 2017 at 12:46
  • So I guess stationary implies $\mu, D$ indep. of time, but not the other way around. Commented May 25, 2017 at 12:47
  • @SpineFeast: Aha, so you speak of a very straightforward notion of stationarity. Well, consider $\mu(x,t)=0$; then the resulting stationary equation $\dfrac{∂^2}{∂x^2}\left(D(x,t)\, p(x, t)\right)=0$ is solved by $p(x, t) = \dfrac{a(t)+b(t)\cdot x}{ D(x,t)}$, and you have the condition that this shall be normalizable. The possible solutions will heavily depend on the dimensionality of $x$, and whether it's bounded. And fyi, of course, here you deal with the Laplace equation now. If you want formal answers, formalize the problem.
    – Nikolaj-K
    Commented May 25, 2017 at 12:53
