
I have a forecasting model for a time series and I want to calculate its out-of-sample prediction error. At the moment the strategy I'm following is the one suggested on Rob Hyndman's blog (near the bottom of the page), which goes like this, assuming a time series $y_1,\dots,y_n$ and a training set of size $k$ (a minimal code sketch follows the steps):

  1. Fit the model to the data $y_t,\dots,y_{t+k-1}$ and let $\hat{y}_{t+k}$ be the forecast for the next observation.
  2. Compute the forecast error as $e_{t} = \hat{y}_{t+k} - y_{t+k}$.
  3. Repeat for $t=1,\dots,n-k$.
  4. Compute the mean square error as $\textrm{MSE}=\frac{1}{n-k}\sum_{t=1}^{n-k} e_t^2$.
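
For concreteness, here is a minimal sketch of that procedure in Python. The AR(1) fit by least squares is only a placeholder for my actual forecasting model, and the function names are my own:

```python
import numpy as np

def one_step_forecast(window):
    """Fit an AR(1) model by least squares and forecast the next value.
    (Placeholder for the actual forecasting model.)"""
    x, y = window[:-1], window[1:]
    X = np.column_stack([np.ones_like(x), x])
    intercept, slope = np.linalg.lstsq(X, y, rcond=None)[0]
    return intercept + slope * window[-1]

def rolling_origin_mse(y, k):
    """Train on y[t:t+k], forecast y[t+k], for t = 0, ..., n-k-1."""
    errors = [one_step_forecast(y[t:t + k]) - y[t + k] for t in range(len(y) - k)]
    return np.mean(np.square(errors))

# Example on a white-noise series of length 100 with a training window of 20
rng = np.random.default_rng(0)
y = rng.normal(size=100)
print(rolling_origin_mse(y, k=20))
```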

My question is how much I have to worry about correlations because of my overlapping training sets. In particular, say I want to forecast not only the next value, but the next $m$ values, so that I have predictions $\hat{y}_{t+k},\dots,\hat{y}_{t+k+m-1}$ and errors $e_{t,1},\dots,e_{t,m}$, and I want to construct a term-structure of prediction errors.
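
The following is a sketch of what I mean by the term structure: collect the errors $e_{t,h}$ for horizons $h=1,\dots,m$ and average over origins to get one MSE per horizon. The AR(1) forecaster is again just a placeholder for my model, and the `step` argument is how far I roll the training window forward each time (1 or $m$, which is exactly what I'm unsure about):

```python
import numpy as np

def ar1_forecast(window, m):
    """Fit an AR(1) model by least squares and iterate it m steps ahead.
    (Placeholder for the actual forecasting model.)"""
    x, y = window[:-1], window[1:]
    X = np.column_stack([np.ones_like(x), x])
    intercept, slope = np.linalg.lstsq(X, y, rcond=None)[0]
    forecasts, last = [], window[-1]
    for _ in range(m):
        last = intercept + slope * last
        forecasts.append(last)
    return np.array(forecasts)

def horizon_mse(y, k, m, step=1):
    """Per-horizon MSE: rows of `errors` are origins, columns are horizons 1..m."""
    errors = []
    for t in range(0, len(y) - k - m + 1, step):
        errors.append(ar1_forecast(y[t:t + k], m) - y[t + k:t + k + m])
    return np.mean(np.square(errors), axis=0)

rng = np.random.default_rng(0)
y = rng.normal(size=200)
print(horizon_mse(y, k=50, m=5))          # roll the window forward by 1
print(horizon_mse(y, k=50, m=5, step=5))  # roll the window forward by m
```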

Can I still roll the window of the training set forward by 1 each time, or should I roll it forward by $m$? How do the answers to these questions change if there is significant autocorrelation in the series that I'm predicting (conceivably it is a long-memory process, i.e., the autocorrelation function decays as a power law rather than exponentially)?

I'd appreciate either an explanation here, or links to somewhere where I can find theoretical results about the confidence intervals around the MSE (or other error measures).

1 Answer

It sounds like you might be more interested in estimating errors using the maximum-entropy bootstrap rather than cross-validation. This will allow you to generate multiple bootstrap replicates of your data, which you can then split into as many train/test sets as you like to calculate confidence intervals for your forecasts.
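
To sketch the idea (this is only an illustration): generate many replicates of the series, run the same rolling evaluation on each, and read a confidence interval off the resulting distribution of MSEs. The moving-block bootstrap and the naive last-value forecast below are just stand-ins; in practice you would substitute a maximum-entropy bootstrap implementation (e.g., the meboot package in R) and your own model:

```python
import numpy as np

def block_bootstrap(y, block_len, rng):
    """Naive moving-block bootstrap replicate -- a stand-in for a
    maximum-entropy bootstrap of the original series."""
    n = len(y)
    starts = rng.integers(0, n - block_len + 1, size=n // block_len + 1)
    return np.concatenate([y[s:s + block_len] for s in starts])[:n]

def rolling_mse(y, k):
    """One-step rolling-origin MSE with a last-value forecast
    (stand-in for the actual model)."""
    errors = [y[t + k - 1] - y[t + k] for t in range(len(y) - k)]
    return np.mean(np.square(errors))

rng = np.random.default_rng(0)
y = rng.normal(size=200).cumsum()  # example series
mses = [rolling_mse(block_bootstrap(y, 20, rng), k=50) for _ in range(200)]
print(np.percentile(mses, [2.5, 97.5]))  # rough 95% interval for the MSE
```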

Rob Hyndman has some further discussion of time series cross-validation on his blog, where he implements several different methods of "rolling" and forecasting, but it's mostly focused on implementation. I have some further implementations on my blog as well. Maybe the simplest approach would be to average your error across all of the time windows, and therefore ignore any potential correlations in errors.

As far as I can tell, the theoretical state of cross-validation for time-series data is somewhat behind the theoretical state of general cross-validation. Intuitively, I expect error to increase as the horizon increases, which suggests that you should expect correlated errors across various forecast horizons. Why does this worry you?
