$\begingroup$

Say I conducted a numerical experiment (a time-dependent fluid simulation) and extracted a time series of some variable. It looks like this: [Figure: the extracted time series]

As a result of the simulation, I am supposed to report the mean value of this signal along with some estimate of how accurate this mean value is from a purely statistical point of view. The idea is: if I conduct the same experiment several times, I get different signals with different mean values. These mean values would be normally distributed. The mean value and standard deviation of this distribution are unknown.

Now if I could perform the experiment several times, estimating the variance of the mean would be straightforward. Then I could estimate some confidence intervals using Student's t-distribution. Yet since compute resources are scarce, this is not an option. All I have is this one sample.

tl;dr: Is there a way to estimate a confidence interval for the mean when all you have is one sample?

I was thinking about making use of the fact that the signal's autocorrelation function vanishes rather quickly. But I haven't been able to turn this into a reliable method. [Figure: autocorrelation function of the signal]

$\endgroup$
  • $\begingroup$ This question boils down to whether you can assume, or justifiably postulate, that your random process is ergodic. $\endgroup$ Commented Apr 21, 2017 at 13:48
  • $\begingroup$ The process is ergodic in the sense that the time-average and the ensemble-average converge towards the same mean value for an infinite sample time and an infinite amount of samples. Is that what you asked for? $\endgroup$
    – MechEng
    Commented Apr 21, 2017 at 13:58
  • $\begingroup$ exactly, if you can make that statement, then you can make a statement about the ensemble variance from the sample variance, right? $\endgroup$ Commented Apr 21, 2017 at 14:01
  • $\begingroup$ Yes. This is what I briefly outlined for the case where I actually had several samples. The catch is that I only have one sample, so estimating a sample variance is not straightforward. $\endgroup$
    – MechEng
    Commented Apr 21, 2017 at 14:05
  • $\begingroup$ exactly, but it might be impossible to make any statement on the ensemble when you don't have / can't claim ergodicity. But: If, as you say, the process is ergodic, then the sample variance == ensemble variance, and you're done as soon as you estimate the sample variance :) $\endgroup$ Commented Apr 21, 2017 at 14:08

1 Answer

$\begingroup$

To the extent you can assume your process is a white stationary ergodic process, the variance of the mean is the variance of the process divided by N where N is the number of samples.

Assuming that you can estimate the variance from your sample using

$$\hat\sigma_x^2 = \frac{1}{N-1}\sum_{i=1}^N(x_i-\hat\mu_x)^2, \qquad \hat\mu_x = \frac{1}{N}\sum_{i=1}^N x_i$$

NOTE: This is the unbiased estimator, since the variance is being estimated from a sample of $N$ elements using a mean $\hat\mu_x$ that is itself estimated from the sample (see Zwillinger, D. (Ed.). CRC Standard Mathematical Tables and Formulae. Boca Raton, FL: CRC Press, 1995.). If the mean were known (which it isn't), there would be a $\frac{1}{N}$ in front of the summation above instead.

You should then be able to derive confidence intervals on the mean itself, with the variance of the mean given by $\frac{\sigma_x^2}{N}$.
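As a rough illustration of that step, here is a minimal sketch that turns the sample variance into a confidence interval for the mean. It assumes white, stationary samples. The function name `mean_confidence_interval` and the synthetic Gaussian signal are made up for the example, and a normal quantile is used in place of Student's t, which is only reasonable when $N$ is large.

```python
import math
import random
import statistics


def mean_confidence_interval(x, confidence=0.95):
    """Confidence interval for the mean, ASSUMING white stationary samples."""
    n = len(x)
    mean = statistics.fmean(x)
    # Unbiased sample variance (1/(N-1) normalization).
    var = statistics.variance(x, xbar=mean)
    # Standard error of the mean: sqrt(sigma_x^2 / N).
    std_err = math.sqrt(var / n)
    # Normal quantile as a large-N stand-in for Student's t (assumption).
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return mean, mean - z * std_err, mean + z * std_err


# Synthetic stand-in for the simulation signal (hypothetical data).
random.seed(0)
signal = [random.gauss(2.0, 1.0) for _ in range(30000)]

mean, lo, hi = mean_confidence_interval(signal)
print(f"mean = {mean:.4f}, 95% CI = [{lo:.4f}, {hi:.4f}]")
```

If the samples are correlated, this interval will be too narrow, because $N$ overstates the number of independent samples.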

This follows from the expressions for the mean and variance of $Y$, where $Y$ is the average of $X$ over $N$ samples, $\mu_x$ is the mean of $X$, and $\sigma_x^2$ is the variance of $X$:

\begin{align} E(Y) &= E\left[\frac{1}{N}\left(X_1+X_2+ \ldots +X_N\right)\right] = \frac{1}{N}E\left[(X_1+X_2+ \ldots +X_N)\right]\\ \textrm{Var}(Y) &= \textrm{Var}\left[\frac{1}{N}\left(X_1+X_2+ \ldots +X_N\right)\right] = \frac{1}{N^2}\textrm{Var}\left(X_1+X_2+ \ldots+X_N\right) \end{align}

\begin{align} E\left[(X_1+X_2+ \ldots +X_N)\right] &= E[X_1]+E[X_2]+ \ldots +E[X_N] = N \mu_x\\ \textrm{Var}(X_1+X_2+ \ldots +X_N) &= \textrm{Var}(X_1)+\textrm{Var}(X_2) + \ldots +\textrm{Var}(X_N) = N\sigma_x^2 \end{align}

Therefore \begin{align} E(Y) &= \frac{1}{N}E[(X_1+X_2+ \ldots +X_N)] =\frac{1}{N}N\mu_x = \mu_x\quad\text{and}\\ \textrm{Var}(Y) &= \frac{1}{N^2}\textrm{Var}(X_1+X_2+ \ldots +X_N) = \frac{1}{N^2}N\sigma_x^2 = \frac{\sigma_x^2}{N} \end{align}
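The result $\textrm{Var}(Y) = \sigma_x^2/N$ can be checked numerically. The sketch below is only an illustration on synthetic white Gaussian noise. The record length, the number of records, and the value of $\sigma_x$ are arbitrary choices, not values from the question.

```python
import random
import statistics

random.seed(1)

N = 100        # samples per record (arbitrary)
M = 2000       # number of independent records (arbitrary)
sigma = 2.0    # true standard deviation of the white process (arbitrary)

# Mean of each independent record of white Gaussian noise.
means = []
for _ in range(M):
    record = [random.gauss(0.0, sigma) for _ in range(N)]
    means.append(statistics.fmean(record))

# Empirical variance of the record means vs. the predicted sigma^2 / N.
var_of_means = statistics.variance(means)
predicted = sigma ** 2 / N

print(f"empirical Var(Y) = {var_of_means:.5f}")
print(f"predicted Var(Y) = {predicted:.5f}")
```

The two numbers should agree to within a few percent. The agreement breaks down once the samples within a record are correlated.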

How to make use of the Autocorrelation Function

Note that this reduction by $N$ is valid only as long as the sequence is white. When neighboring samples are correlated they carry largely redundant information, so adding them does not reduce the variance of the estimate by the full factor. You can therefore use your autocorrelation function to determine the number of effectively independent samples that do reduce the variance. I do not have an exact calculation for this, but as a rough order-of-magnitude estimate I would use the lag at which the normalized autocorrelation drops below 0.5. For example, in your data the total number of samples is 30,000. If the autocorrelation dropped below 0.5 after just one sample, then the samples can be treated as independent and the variance of the mean would be $\sigma_x^2/30{,}000$. However, if the autocorrelation does not drop below 0.5 until 100 samples, then the variance of the mean would be approximately $\sigma_x^2/300$.
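The rule of thumb above can be sketched as follows. The helper names, the synthetic AR(1) test signal, and the maximum lag are all hypothetical choices for illustration. For comparison, the sketch also evaluates a standard large-$N$ result that is not part of the rule of thumb, $\textrm{Var}(Y) \approx \frac{\sigma_x^2}{N}\left(1 + 2\sum_{k\ge 1}\rho_k\right)$, where $\rho_k$ is the normalized autocorrelation at lag $k$. Truncating the sum at the first non-positive estimate is a common heuristic and an assumption here.

```python
import random
import statistics


def autocorrelation(x, max_lag):
    """Normalized (biased) autocorrelation estimate for lags 0..max_lag."""
    n = len(x)
    mean = statistics.fmean(x)
    d = [v - mean for v in x]
    c0 = sum(v * v for v in d) / n
    return [
        sum(d[i] * d[i + k] for i in range(n - k)) / (n * c0)
        for k in range(max_lag + 1)
    ]


def first_lag_below(rho, threshold=0.5):
    """First lag at which the autocorrelation drops below the threshold."""
    for k, r in enumerate(rho):
        if r < threshold:
            return k
    return len(rho)  # never dropped below within the computed lags


def correlation_factor(rho):
    """1 + 2 * sum(rho_k), truncated at the first non-positive estimate."""
    total = 0.0
    for r in rho[1:]:
        if r <= 0.0:
            break
        total += r
    return 1.0 + 2.0 * total


# Synthetic correlated signal: AR(1) with coefficient 0.9 (hypothetical data).
random.seed(2)
n = 20000
phi = 0.9
x = [0.0]
for _ in range(n - 1):
    x.append(phi * x[-1] + random.gauss(0.0, 1.0))

rho = autocorrelation(x, max_lag=100)
var_x = statistics.variance(x)

lag = first_lag_below(rho)            # rule of thumb from the answer
n_eff_rule = n / lag
factor = correlation_factor(rho)      # standard large-N correction
n_eff_sum = n / factor

print(f"lag where rho < 0.5: {lag}")
print(f"Var(mean), white assumption : {var_x / n:.6f}")
print(f"Var(mean), 0.5-threshold    : {var_x / n_eff_rule:.6f}")
print(f"Var(mean), summed rho       : {var_x / n_eff_sum:.6f}")
```

For this kind of signal the two estimates of the effective sample count typically differ by a small factor, which is consistent with treating the 0.5 threshold as an order-of-magnitude guide rather than an exact calculation.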

$\endgroup$
  • $\begingroup$ I am aware of this. It is what I briefly outlined for the case where I actually have more than one sample. But it does not apply here since I can not estimate the sample variance. $\endgroup$
    – MechEng
    Commented Apr 21, 2017 at 16:50
  • $\begingroup$ Why can't you estimate the sample variance? I added what I would do to estimate the sample variance, so not sure yet what I am missing- can you clarify? $\endgroup$ Commented Apr 21, 2017 at 17:04
  • $\begingroup$ When you say "one sample" do you literally mean N=1? If so there is absolutely no information you can get regarding mean and variance, unless you have some sort of prior information about the signal. That makes entirely no sense, so I had assumed of course you meant "one sample set"...now I am not so sure? $\endgroup$ Commented Apr 21, 2017 at 17:20
  • $\begingroup$ @MechEng Can you clarify this further based on my last comments? $\endgroup$ Commented Apr 23, 2017 at 2:36
