I have 50 years of rainfall data, categorised by intensity over a region (e.g. 1-10 mm/day, 11-20 mm/day, etc.). I want to test for trends in the frequency of each category. Is the Mann-Kendall trend test appropriate for this kind of data, or is there a better option available?
-
Why is intensity bucketed? If you have the unbucketed, continuous information, that would be so much more informative as inputs to any model. – user78229, Apr 4, 2017 at 12:09
-
I want to know the trends in each category (intensity) of rainfall. – ajilesh, Apr 5, 2017 at 5:28
-
So, bucket it after building the model based on continuous rainfall. – user78229, Apr 5, 2017 at 10:36
-
Doesn't it work this way? – ajilesh, Apr 6, 2017 at 9:19
-
By "work this way," are you referring to your a priori bucketing? Of course it can be done "this way" but what naive analysts don't realize is that they are throwing tons of information out the window in the process. Just think about it "this way" -- you have 50 years of daily information containing rainfall information expressed in millimeters. That is over 18,000 data points. By bucketing this information -- even into percentiles -- you are collapsing this information down into a much smaller range of possible values. – user78229, Apr 6, 2017 at 12:46
1 Answer
Using all 50 years of data gives a very long-term measure of trend. Why not compare that with the most recent 10 years, the most recent year, the most recent month, and so on, to tease out how the time series is evolving? Evaluating the slopes over these different data partitions is also informative: a positive slope suggests growth over that window, a negative slope suggests decline, and a steeper slope suggests faster change.
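As a rough illustration of comparing slopes over nested windows, here is a minimal sketch in plain Python. The `yearly_counts` series is made-up data standing in for the number of days per year in one intensity category, and `ols_slope` and `window_slopes` are helper names introduced here for illustration, not library functions.

```python
def ols_slope(values):
    """Least-squares slope of values against their index 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points")
    mean_t = (n - 1) / 2
    mean_y = sum(values) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(values))
    sxx = sum((t - mean_t) ** 2 for t in range(n))
    return sxy / sxx


def window_slopes(values, windows):
    """Slope over the most recent w observations, for each w in windows."""
    return {w: ols_slope(values[-w:]) for w in windows if w <= len(values)}


# Hypothetical data: flat for 40 years, then rising by 2 days per year.
yearly_counts = [20.0] * 40 + [20.0 + 2.0 * k for k in range(1, 11)]

slopes = window_slopes(yearly_counts, windows=(50, 20, 10))
for w, s in slopes.items():
    print(f"last {w:2d} years: slope = {s:+.3f} days/year")
```

In this toy series the full-record slope is much flatter than the slope over the last decade, which is the kind of contrast the comparison is meant to expose. Ordinary least squares is only one choice here; a robust slope estimator could be swapped in.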
Your data is a prime candidate for the classic univariate time series models, such as ARIMA models fitted with the Box-Jenkins methodology, and their many variants. These models enable evaluation of nontrivial issues inherent to long-range time series such as autocorrelation, unit roots, cointegration, stationarity, trend drift, lead-lag relationships, inertia, and so on. The goal is to model with HAC errors (heteroskedasticity and autocorrelation consistent, or robust, errors, as in the 1987 paper by Newey and West, A Simple, Positive Semi-Definite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix). You would also want to evaluate and control for issues related to seasonality and global warming.
There are various tests for each:
Differencing the time series usually takes care of non-stationarity. Dickey has a useful paper about that, as well as about unit roots, here ... http://www2.sas.com/proceedings/sugi30/192-30.pdf Another good discussion of unit roots is in Sims and Uhlig's paper Understanding Unit Rooters: A Helicopter Tour ... http://home.uchicago.edu/~huhlig/papers/uhlig.sims.econometrica.1991.pdf Other unit root tests are the Phillips-Perron test and the Ng-Perron test. The Kwiatkowski, Phillips, Schmidt, and Shin (KPSS) test is a useful complement, since its null hypothesis is stationarity rather than a unit root.
Models usually assume independently distributed errors. Dependent or autocorrelated errors signal model misspecification; the classic test is the Durbin-Watson test. Other tests are available, such as the BDS test, the runs test, the turning point test, the Ljung-Box test and the rank version of the von Neumann ratio test. The Ljung-Box test can also be used for detecting seasonality as well as trend magnitude.
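To make two of these diagnostics concrete, here is a minimal sketch of the Durbin-Watson statistic and the Ljung-Box Q statistic in plain Python. The simulated series, the AR(1) coefficient of 0.7 and the function names are assumptions for illustration only; in practice you would apply these to the residuals of your fitted model and would probably use a statistics package rather than hand-rolled code.

```python
import random


def durbin_watson(residuals):
    """DW statistic: near 2 if uncorrelated, below 2 for positive lag-1 autocorrelation."""
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e ** 2 for e in residuals)
    return num / den


def autocorr(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    den = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / den


def ljung_box_q(x, max_lag):
    """Ljung-Box Q = n(n+2) * sum_k r_k^2 / (n-k), for k = 1..max_lag."""
    n = len(x)
    return n * (n + 2) * sum(
        autocorr(x, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


# Simulated stand-ins for model residuals (seeded for repeatability).
rng = random.Random(0)
white = [rng.gauss(0.0, 1.0) for _ in range(500)]

ar1, prev = [], 0.0
for shock in white:
    prev = 0.7 * prev + shock  # AR(1) with coefficient 0.7 (assumed)
    ar1.append(prev)

dw_white, dw_ar = durbin_watson(white), durbin_watson(ar1)
q_white, q_ar = ljung_box_q(white, 10), ljung_box_q(ar1, 10)
print(f"DW: white = {dw_white:.2f}, AR(1) = {dw_ar:.2f}")
print(f"Ljung-Box Q(10): white = {q_white:.1f}, AR(1) = {q_ar:.1f}")
```

Under the null of no autocorrelation, Q is compared against a chi-square distribution with `max_lag` degrees of freedom (the 5% critical value for 10 degrees of freedom is about 18.3). When the test is applied to residuals from a fitted ARMA model, the degrees of freedom are usually reduced by the number of estimated ARMA parameters.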
The Engle-Granger test is useful for evaluating issues related to cointegration ... http://www2.warwick.ac.uk/fac/soc/economics/staff/gboero/personal/hand2_cointeg.pdf
A good but not commonly used metric is the Hurst exponent, H. It has some nice properties, particularly for long-range time series such as yours. Here's a good review of this metric ... http://www.financialwisdomforum.org/gummy-stuff/hurst.htm Also here ... http://sfb649.wiwi.hu-berlin.de/fedc_homepage/xplore/tutorials/xfghtmlnode99.html
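For a feel of how H is estimated, here is a minimal sketch of the classical rescaled-range (R/S) approach in plain Python. The chunk sizes (powers of two starting at 8), the simulated series and the function names are choices made for this illustration. It is a crude estimator with known small-sample bias, and with only 50 annual values it would rest on very few chunk sizes, so treat any result on such a short series with caution.

```python
import math
import random


def rescaled_range(chunk):
    """R/S of one chunk: range of cumulative deviations divided by the std dev."""
    n = len(chunk)
    mean = sum(chunk) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in chunk) / n)
    if std == 0:
        return None  # constant chunk, R/S undefined
    cum = lo = hi = 0.0
    for v in chunk:
        cum += v - mean
        lo, hi = min(lo, cum), max(hi, cum)
    return (hi - lo) / std


def hurst_rs(series, min_chunk=8):
    """Estimate H as the slope of log(mean R/S) against log(chunk size)."""
    n = len(series)
    xs, ys = [], []
    size = min_chunk
    while size <= n // 2:
        stats = [rescaled_range(series[i:i + size])
                 for i in range(0, n - size + 1, size)]
        stats = [s for s in stats if s is not None]
        if stats:
            xs.append(math.log(size))
            ys.append(math.log(sum(stats) / len(stats)))
        size *= 2
    if len(xs) < 2:
        raise ValueError("series too short for an R/S estimate")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))


# Simulated examples: white noise and its cumulative sum (a random walk).
rng = random.Random(42)
noise = [rng.gauss(0.0, 1.0) for _ in range(1024)]
walk, level = [], 0.0
for step in noise:
    level += step
    walk.append(level)

h_noise = hurst_rs(noise)
h_walk = hurst_rs(walk)
print(f"white noise H ~ {h_noise:.2f}, random walk H ~ {h_walk:.2f}")
```

Values of H near 0.5 are consistent with no long-range dependence, while values well above 0.5 point to persistence; the random walk is included only as a contrast case with strong persistence.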
The Mann-Kendall test is fine for detecting monotonic trends. Another nonparametric measure would be the Spearman correlation.
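For the question as asked, the test would be run once per intensity category on the yearly frequencies. Here is a minimal sketch of the Mann-Kendall test in plain Python, using the usual normal approximation with a tie correction. The two example series and the function name are made up for illustration. Note that the plain test assumes independent observations, so autocorrelation in the yearly counts can distort the p-value.

```python
import math


def mann_kendall(values):
    """Return (S, Z, two-sided p) for the Mann-Kendall trend test."""
    n = len(values)
    # S counts concordant minus discordant pairs in time order.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = values[j] - values[i]
            s += (diff > 0) - (diff < 0)

    # Variance of S under the null, corrected for groups of tied values.
    tie_counts = {}
    for v in values:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    var_s = n * (n - 1) * (2 * n + 5)
    for t in tie_counts.values():
        var_s -= t * (t - 1) * (2 * t + 5)
    var_s /= 18

    # Continuity-corrected Z score.
    if var_s == 0 or s == 0:
        z = 0.0
    elif s > 0:
        z = (s - 1) / math.sqrt(var_s)
    else:
        z = (s + 1) / math.sqrt(var_s)

    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return s, z, p


# Hypothetical yearly day counts for two intensity categories.
rising = [12, 13, 15, 16, 18, 19, 21, 22, 24, 25]
flat = [30] * 10

s_up, z_up, p_up = mann_kendall(rising)
s_flat, z_flat, p_flat = mann_kendall(flat)
print(f"rising: S = {s_up}, Z = {z_up:.2f}, p = {p_up:.4f}")
print(f"flat:   S = {s_flat}, Z = {z_flat:.2f}, p = {p_flat:.4f}")
```

Because several categories are tested at once, some adjustment for multiple comparisons is worth considering when reading the per-category p-values.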
-
(1) I find the formulation of the goal a bit strange: The goal is to model with HAC errors. (2) Could rainfall have a unit root? Somehow it does not seem natural. And if it cannot, then the low power of the unit root tests could become an issue. (3) I would skip Durbin-Watson and go directly to Ljung-Box as inexperienced users might not appreciate the limitations of the DW test; but this is just a preference. (4) Ljung-Box test detecting trend magnitude? Interesting. Do you have a reference? (5) How are the Engle-Granger test and cointegration applicable to univariate data? – Apr 4, 2017 at 13:36
-
@RichardHardy All excellent, totally orthodox points (I expect nothing less from you) and appropriately nuanced for both the problem and the OP. My response was much more of a shotgun or laundry list of possible approaches. You should answer this one. Regarding the L-B test, a search for this test with other keywords such as "trend magnitude" will uncover some literature. With regard to (5), I was thinking of a transfer function model. – Apr 4, 2017 at 13:52