I have a small time series with monthly intervals. I want to plot it and then decompose it into seasonality, trend, and residuals. I start by importing the CSV into pandas and then plotting just the time series, which works fine. I am following this tutorial, and my code looks like this:
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
ali3 = pd.read_csv('C:\\Users\\ALI\\Desktop\\CSV\\index\\ZIAM\\ME\\ME_DATA_7_MONTH_AVG_PROFIT\\data.csv',
names=['Date', 'Month','AverageProfit'],
index_col=['Date'],
parse_dates=True)
# Delete the Month column, which is a string
del ali3['Month']
ali3
plt.plot(ali3)
At this stage I try the seasonal decomposition like this:
import statsmodels.api as sm
res = sm.tsa.seasonal_decompose(ali3.AverageProfit)
fig = res.plot()
which results in the following error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-41-afeab639d13b> in <module>()
1 import statsmodels.api as sm
----> 2 res = sm.tsa.seasonal_decompose(ali3.AverageProfit)
3 fig = res.plot()
C:\Users\D063375\AppData\Local\Continuum\Anaconda2\lib\site-packages\statsmodels\tsa\seasonal.py in seasonal_decompose(x, model, filt, freq)
86 filt = np.repeat(1./freq, freq)
87
---> 88 trend = convolution_filter(x, filt)
89
90 # nan pad for conformability - convolve doesn't do it
C:\Users\D063375\AppData\Local\Continuum\Anaconda2\lib\site-packages\statsmodels\tsa\filters\filtertools.py in convolution_filter(x, filt, nsides)
287
288 if filt.ndim == 1 or min(filt.shape) == 1:
--> 289 result = signal.convolve(x, filt, mode='valid')
290 elif filt.ndim == 2:
291 nlags = filt.shape[0]
C:\Users\D063375\AppData\Local\Continuum\Anaconda2\lib\site-packages\scipy\signal\signaltools.py in convolve(in1, in2, mode)
468 return correlate(volume, kernel[slice_obj].conj(), mode)
469 else:
--> 470 return correlate(volume, kernel[slice_obj], mode)
471
472
C:\Users\D063375\AppData\Local\Continuum\Anaconda2\lib\site-packages\scipy\signal\signaltools.py in correlate(in1, in2, mode)
158
159 if mode == 'valid':
--> 160 _check_valid_mode_shapes(in1.shape, in2.shape)
161 # numpy is significantly faster for 1d
162 if in1.ndim == 1 and in2.ndim == 1:
C:\Users\D063375\AppData\Local\Continuum\Anaconda2\lib\site-packages\scipy\signal\signaltools.py in _check_valid_mode_shapes(shape1, shape2)
70 if not d1 >= d2:
71 raise ValueError(
---> 72 "in1 should have at least as many items as in2 in "
73 "every dimension for 'valid' mode.")
74
ValueError: in1 should have at least as many items as in2 in every dimension for 'valid' mode.
Can anyone shed some light on what I'm doing wrong and how I might fix it? Much obliged.
Edit: This is what the data frame looks like:
Date AverageProfit
2015-06-01 29.990231
2015-07-01 26.080038
2015-08-01 25.640862
2015-09-01 25.346447
2015-10-01 27.386001
2015-11-01 26.357709
2015-12-01 25.260644
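The error message says the input must have at least as many items as the filter, and the traceback shows the filter being built from `freq`. A minimal pure-Python sketch of that length check follows, using the seven values above. It is not the statsmodels implementation: the function name `valid_moving_average` is hypothetical, and the period of 12 for monthly data is an assumption about what gets inferred from the index.

```python
# The seven monthly values from the data frame above.
profits = [29.990231, 26.080038, 25.640862, 25.346447,
           27.386001, 26.357709, 25.260644]


def valid_moving_average(values, period):
    """Average every full window of `period` items ('valid' mode).

    Hypothetical helper for illustration only, not a statsmodels function.
    """
    if len(values) < period:
        raise ValueError(
            "need at least %d observations, got %d" % (period, len(values)))
    # Equal weights, like the np.repeat(1./freq, freq) line in the traceback.
    weight = 1.0 / period
    return [weight * sum(values[i:i + period])
            for i in range(len(values) - period + 1)]


# Assumed period of 12 (one year of monthly data): longer than the series,
# so no full window exists and the check fails.
try:
    valid_moving_average(profits, 12)
    fits_yearly = True
except ValueError:
    fits_yearly = False

# A hypothetical period of 3 does fit and yields 7 - 3 + 1 = 5 averages.
short_trend = valid_moving_average(profits, 3)
```

The signature in the traceback lists a `freq` argument, so a shorter period could presumably be passed there explicitly. With only seven points, though, any seasonal estimate would rest on very little data.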
freq value, that is, the seasonality timescale?