3
$\begingroup$

I hope this is the right section for this kind of questions.

I am trying to simulate, with MATLAB, a diffusion model starting from a Random Walk. I am using a Random Walk with information increment X normally distributed ($\mu, \sigma$ ). I also have a boundary $\alpha $, and $\alpha > \mu$. The starting point is 0.

If I understood this right, this should be an approximation of the Wiener Process. As wikipedia says (http://en.wikipedia.org/wiki/Inverse_Gaussian_distribution), the first passage time for a fixed level $\alpha > 0 $by $X_t$ is distributed according to an inverse-gaussian: $ T_\alpha = \inf\{ 0 < t \mid X_t=\alpha \} \sim IG(\tfrac\alpha\nu, \tfrac {\alpha^2} {\sigma^2}).\, $

What I am trying to do is to simulate a Random Walk and to get the first passage time distribution, verifying that it is actually a Inv. Gaussian with those parameters.

This is the code I have done: http://pastebin.com/E1N58sJ4

Notice that the myHist function is commented, but I normally use it: in that function I fit the resulting distribution to an inverse gaussian. Then I compare the fitted parameters with the two parameters obtained by the formula showed above, $\mu= \alpha/v $ and $ \lambda= \alpha^2/\sigma^2 $

However, the two results are NOT THE SAME, and they differ consistently across simulation. For example, with the parametrers used in the code:

Simulation - mu:29.1771 s: 116.7757 Expected - mu:26.3158 s: 100

This difference is consistent across repetition of simulation.

Can anyone spot the mistake?

{I wrote this question in CrossValidated, but I deleted the post there and I put it here, because it seems more appropriate. One -not definitive- answer that I've got regarded the time step size of my function. Unfortunately, I don't know how to change that, so if you could provide some help with the implementation that would be much appreciated}

$\endgroup$
3
  • $\begingroup$ Do you think this might be a implementation problem? If so, I would suggest posting on stack overflow $\endgroup$
    – oLas
    Commented Mar 27, 2014 at 19:28
  • $\begingroup$ I think that the code is doing exactly what I want to do. I am probably missing something about the concept behind the simulation of a wiener process using discreet steps. What I meant is that I would appreciate if any advice would also come with the implementation of the advice. For example, in Crossvalidation I received the useful help of Juho Kokkala, which is: "Your code uses a quite low sampling rate (Δt=1) for simulating the Wiener process [...] Indeed, by modifying your code to use Δt=0.01, I got a mean of simulated Tαs much closer to the analytically obtained 26.3. [...]" $\endgroup$
    – Vaaal88
    Commented Mar 27, 2014 at 19:32
  • $\begingroup$ However, I don't know how to implement that advice in the software (it is probably obvious to Juho Kokkala, but unfortunately I am not so skilled) $\endgroup$
    – Vaaal88
    Commented Mar 27, 2014 at 19:33

2 Answers 2

4
$\begingroup$

Your simulation code looks okay. Some suggestions. You're never going to get reliable estimates of your parameters by taking such large time steps when your boundary is so close. Also, only 500 runs is is pretty small, you might increase it to 3,000 or more. You might find this paper helpful in terms of learning about Wiener processes and how to implement them (including how to change the time step) in Matlab:

Desmond J. Higham, 2001, An Algorithmic Introduction to Numerical Simulation of Stochastic Differential Equations, SIAM Rev. (Educ. Sect.), 43 525–46. http://dx.doi.org/10.1137/S0036144500378302

The URL to the Matlab files in the paper won't work, use this one.

However, I think that your real issue may be in fitting the simulated data to obtain fitted parameters. Firstly, it might be a good idea to use ksdensity instead of histogramming:

[F,XI] = ksdensity(rt(1,:),'support','positive');
plot(XI,F)

Second, you should use a maximum likelihood estimation-based fitting method rather than generic nonlinear least squares. The fitdist functions use this for the most part and they already include support for the inverse Gaussian distribution:

pd = fitdist(rt(1,:).','InverseGaussian')

Finally, when performing psedo-random simulations, you should always set a seed to make your experiments repeatable: use rng.

$\endgroup$
3
  • $\begingroup$ hi horchler and thank you for your answer and for your suggestions. What made you think that I used a nonlinear least square for the fitting? $\endgroup$
    – Vaaal88
    Commented Mar 29, 2014 at 13:25
  • $\begingroup$ @Vaaal: The fact that it's the default method used by fit (which doesn't support MLE) and that you were using a histogram. And that your answers were off (and presumably had large confidence intervals too). Of course the time steps are a problem, but I still think that the root cause of your error is in fitting. What were you doing in myHist? $\endgroup$
    – horchler
    Commented Mar 29, 2014 at 16:02
  • $\begingroup$ Hi Horchler, sorry I should have been more specific in my comment. Inside myHist I am using MLE to fit the distribution. The keyword 'fit' as an optional parameter in myHist does not refer to the fit function in matlab. Sorry for the confusion. $\endgroup$
    – Vaaal88
    Commented Mar 29, 2014 at 18:18
3
$\begingroup$

I did not spot any mistakes in the code. The discrepancy is probably due to low sampling rate used in simulating the Wiener process. Based on comments, you also need some background information about what is going on when simulating the Wiener process by discrete steps. In this answer, I first give some background information and then illustrate the error caused by simulating the boundary crossing time by too low sampling rate.

Background

I denote by $W_t$ the standard Wiener process (with no drift), whence \begin{equation} X_t = W_t + \nu\,t,~t\geq 0 \end{equation} is the process you want to simulate. So, it suffices to simulate $W_t$ and adding the drift is simple. The Wiener process (http://en.wikipedia.org/wiki/Wiener_Process) is a continuous-time stochastic process with $W_t=0$ and the increments are distributed as \begin{equation} W_t - W_s \sim N(0, t-s) \end{equation} furthermore, increments over separate intervals (say, $W_3-W_2$ and $W_2-W_1$) are independent. Note that $t-s$ here is the variance of the increment, not standard deviation. We cannot actually simulate a continuous realization from this process using a computer + random number generator, but must resort to discrete samples, e.g., at times $\Delta t, 2\Delta t, 3\Delta t$ etc. Based on the independent increments property, we can obtain these samples by the following algorithm

  1. Let $W_0 = 0$
  2. For t = $\Delta t, 2\Delta t, \ldots$: $W_{(k+1)\Delta t} = W_{k\Delta_t} + N(0,\Delta t)$

For Wiener process, this method is exact in the sense that the distribution of thse $W_{k\Delta t}$ will be correct (also the joint distribution). The values of $X_t$ at the same points are obtained by simply adding the drift.

The cause of the error described in the question

Your code uses the method I described previously, with $\Delta t = 1$, to simulate $X_t$ for $t=1,2,\ldots$. Then, you find the first $X_t$ that exceeds the boundary, and estimate the boundary-crossing time using linear interpolation between $X_{t-1},X_t$. This corresponds to approximating the process $X_t$ by linear interpolation for non-integer $t$ and finding where this interpolated process crosses the boundary. However, it turns out that this approximation overestimates $T_\alpha$ on average, because it misses cases where the continuous process $X_t$ crosses the boundary and returns below the boundary before next sampled value.

See the following figure for an illustration, I simulated $X_t$ with $\Delta t=0.01$ (the red curve). The blue curve is simulation with $\Delta t = 1$ (which I obtained by taking every 100th value of the red curve). The blue curve corresponds to what your code simulates. The vertical black lines describe the times when the simulated processes cross the boundary. The red curve crosses the boundary at $T_\alpha = 20.1$, but as it returns below the boundary before $t = 21$, the blue curve does not cross the boundary here but only much later at $\hat{T}_\alpha = 29.5$. Apparently, this is typical enough to cause the discrepancy described in your code. However, note that I 'cheated' and simulated several datasets before obtaining a discrepancy this big, it doesn't happen every time).

Implementation tip

Your code has a line amounting to $X_{k+1} = X_k + N(v,1)$. Change this to $X_{\Delta t(k+1)} = X_{\Delta t\,k} + N(\Delta t\,v,\Delta t)$ to achieve higher sampling rate. When finding the boundary crossing time, note that the times are now $\Delta t,2\Delta t,\ldots$ instead of $1,2,\ldots$.

$\endgroup$
4
  • $\begingroup$ Hi Juho, thank you very much for your fantastic answer. It was clear and incredibly helpful. $\endgroup$
    – Vaaal88
    Commented Mar 29, 2014 at 13:28
  • $\begingroup$ Juho, in the section "Implementation tip", shouldn't the formula be N(dtv, dts) ? Apparently is not. But, therefore, where should I use "s" in the first place? $\endgroup$
    – Vaaal88
    Commented Mar 29, 2014 at 19:01
  • $\begingroup$ Let me restate the question: how could I change the variance of the process? Can i just replace N(dtv,dt) with N(dtv, s*dt)? $\endgroup$
    – Vaaal88
    Commented Mar 29, 2014 at 23:34
  • 1
    $\begingroup$ If $\sigma^2$ is the variance of increments with $\Delta t = 1$, then the variance of the increments with a general $\Delta t$ is $\sigma^2\,\Delta t$. $\endgroup$ Commented Mar 31, 2014 at 11:11

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .