28 events
when | what | by | license | comment
Feb 13 at 12:07 answer added Richard Hardy timeline score: 0
May 11, 2020 at 6:25 comment added Richard Hardy @OldSchool, thank you for the interesting ideas. Could you give a reference for the first one, or maybe even write your own answer explaining it? I have not read the book you cite, so I do not have a view on it. It will be interesting to take a look if I can find the book.
May 8, 2020 at 3:06 comment added OldSchool @RichardHardy, in the case where you wish to ensure that test folds always chronologically follow the training set and never precede it, you can still construct train and validation folds so that they are all the same size. You won't get as much reuse out of resampling, but it can be arranged. Would that allay your concern about CV systematically favouring too-parsimonious models? Then there is also the possibility of purging and embargoing the test folds as described in Lopez de Prado's 2018 book Advances in Financial Machine Learning. What's your view on the approach taken there?
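The fold construction described in the comment above (test folds strictly following their training windows, all folds the same size) can be sketched as follows. This is a hypothetical illustration; the function name and window sizes are my own choices, not taken from the cited book:

```python
# Rolling-origin (forward-chaining) cross-validation sketch:
# each test fold strictly follows its training window in time,
# and all training windows (and all test folds) are the same size.
def rolling_origin_splits(n, train_size, test_size, step):
    """Yield (train_indices, test_indices) pairs over n time points."""
    start = 0
    while start + train_size + test_size <= n:
        train = list(range(start, start + train_size))
        test = list(range(start + train_size, start + train_size + test_size))
        yield train, test
        start += step

# Example: 200 observations, 100-point training windows, 20-point test folds.
splits = list(rolling_origin_splits(200, 100, 20, 20))
# In every split, each test index comes after every training index,
# so no test observation ever precedes its training data.
```

Compared with ordinary K-fold CV, fewer observations get reused across folds, but the chronological ordering is preserved, which is the point of the comment.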
Jul 25, 2019 at 9:14 answer added Tim timeline score: 7
Jul 25, 2019 at 6:28 answer added StoryTeller0815 timeline score: 5
Jun 30, 2019 at 11:01 history bumped CommunityBot This question has answers that may be good or bad; the system has marked it active so that they can be reviewed.
Mar 1, 2019 at 14:01 history bumped CommunityBot
Jan 30, 2019 at 9:21 answer added Dovini Jayasinghe timeline score: 0
Dec 8, 2018 at 19:45 comment added Richard Hardy @IsabellaGhement, why should it? There is no reason to restrict ourselves to this particular use of cross validation. This is not to say that cross validation cannot be used for model assessment, of course.
Dec 8, 2018 at 18:13 comment added Isabella Ghement Maybe I am missing something in this thread, but wouldn't time series cross-validation already assume that you have a model selected and that you are trying to assess the accuracy of the forecasts it produces?
Nov 30, 2018 at 20:49 comment added Richard Hardy @F.Tusell, thank you for your insight. If I remember correctly, AICc is just a second-order asymptotic approximation as compared to AIC's first order. So it is just a more precise version of AIC, and that applies regardless of the sample size, but that mainly becomes important when the sample size is small. Just to say that even with AICc we are not getting away from the asymptotic justification for the method.
Nov 30, 2018 at 18:46 comment added F. Tusell I think several papers by Clifford Hurvich and others address this problem in the context of different models. If I remember well, a variant called AICc was proposed to address shortcomings of AIC in small samples --small samples are a problem not only for cross-validation. These papers are dated from 1989 onwards, in the Journal of Time Series Analysis and Biometrika I think.
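The AICc variant mentioned in the two comments above adds a second-order small-sample correction to AIC. A quick numeric sketch of the standard correction term 2k(k+1)/(n-k-1), showing that it matters for small n and vanishes asymptotically (function names are my own):

```python
def aic(loglik, k):
    """Akaike information criterion: -2 * log-likelihood + 2 * (number of parameters)."""
    return -2 * loglik + 2 * k

def aicc(loglik, k, n):
    """AICc: AIC plus the small-sample correction 2k(k+1)/(n-k-1).

    The correction term shrinks to zero as n grows, so AICc converges
    to AIC asymptotically, as noted in the comment thread.
    """
    return aic(loglik, k) + 2 * k * (k + 1) / (n - k - 1)

# For k = 5 parameters, compare the extra penalty at small vs large n:
penalty_small = aicc(0.0, 5, 30) - aic(0.0, 5)    # 2*5*6/24 = 2.5
penalty_large = aicc(0.0, 5, 3000) - aic(0.0, 5)  # 60/2994, about 0.02
```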
Oct 26, 2018 at 15:01 history bumped CommunityBot
Sep 21, 2018 at 21:01 history bumped CommunityBot
Aug 22, 2018 at 4:01 history bumped CommunityBot
Jul 21, 2018 at 22:17 history bumped CommunityBot
Jun 17, 2018 at 14:04 comment added meh Agreed. It seems that any exact formula about the statistics of empirical observations amounts to the CLT at some point. A priori, the fact that CV is empirical is clear, whereas for AIC it isn't. To me, CV seems more like the MLE, but of course that can be bad for small data sets too.
Jun 16, 2018 at 5:30 comment added Richard Hardy @aginensky, thanks. I think interesting properties of CV are also asymptotic (aren't they?), so the question whether to choose AIC or CV is still nontrivial (though I expect the bias towards simpler models might be too big in CV such that AIC would be preferred; I wonder about the variances).
Jun 15, 2018 at 20:39 comment added meh I've spent a fair amount of time trying to understand AIC. The equality in the statement is based on numerous approximations that amount to versions of the CLT. I personally think this makes AIC very questionable for small samples.
Jun 15, 2018 at 20:37 comment added meh There is this thread on this site about AIC/BIC vs. CV: stats.stackexchange.com/questions/577/…
Jun 15, 2018 at 19:41 comment added Analyst There are theoretical reasons for favoring AIC or BIC: if one starts with likelihood and information theory, a criterion based on those has well-known statistical properties. But often one is dealing with a data set that is not so large.
Feb 13, 2018 at 16:53 history edited Richard Hardy CC BY-SA 3.0 (edited title)
Apr 13, 2017 at 12:44 history edited CommunityBot (replaced http://stats.stackexchange.com/ with https://stats.stackexchange.com/)
Nov 25, 2015 at 4:21 history tweeted twitter.com/StackStats/status/669370241236017152
Feb 26, 2015 at 7:10 comment added Richard Hardy @CagdasOzgenc, I asked Rob J. Hyndman regarding whether cross validation is likely to systematically favour too-parsimonious models in the context given in the OP and got a confirmation, so that is quite encouraging. I mean, the idea I was trying to explain in the chat seems to be valid.
Feb 25, 2015 at 13:11 comment added probabilityislogic I would imagine BIC is also equivalent to a "longer" forecast (m-step-ahead), given its link to leave-k-out cross validation. For 200 observations, though, it probably doesn't make much difference (a penalty of 5p instead of 2p).
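The penalty figures in the comment above can be checked directly: for n = 200, BIC's per-parameter penalty log(n) is roughly 5.3, versus AIC's constant 2. A minimal arithmetic sketch:

```python
import math

n = 200
aic_penalty_per_param = 2.0           # AIC = -2 loglik + 2 p
bic_penalty_per_param = math.log(n)   # BIC = -2 loglik + log(n) p, ~5.30 for n = 200

# The two criteria differ only in this per-parameter penalty,
# so for n = 200 BIC penalizes each extra parameter about 2.65x as heavily.
ratio = bic_penalty_per_param / aic_penalty_per_param
```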
Feb 25, 2015 at 12:20 history edited Richard Hardy CC BY-SA 3.0 (added 117 characters in body)
Feb 25, 2015 at 11:36 history asked Richard Hardy CC BY-SA 3.0