
I have a Pandas DataFrame as below.

df
                              A           B
date_time                                    
2014-07-01 06:03:59.614000  62.1250       NaN
2014-07-01 06:03:59.692000  62.2500       NaN
2014-07-01 06:13:34.524000  62.2500  241.0625
2014-07-01 06:13:34.602000  62.2500  241.5000
2014-07-01 06:15:05.399000  62.2500  241.3750
2014-07-01 06:15:05.399000  62.2500  241.2500
2014-07-01 06:15:42.004000  62.2375  241.2500
2014-07-01 06:15:42.082000  62.2375  241.3750
2014-07-01 06:15:42.082000  62.2375  240.2500

I want to change its frequency to regular 1-minute intervals, but I get the error below:

new = df.asfreq('1Min')
>>error: cannot reindex from a duplicate axis

Now, I understand why this is happening. Since my time granularity is high (in milliseconds) but irregular, I get multiple readings per minute, even per second. So I tried to combine these millisecond readings to minutes and get rid of duplicates as below.

# try to convert the index to minutes and drop duplicates
df['index'] = df.index
df['minute_index']= df['index'].apply( lambda x: x.strftime('%Y-%m-%d %H:%M'))
df.drop_duplicates(cols = 'minute_index', inplace = True, take_last = True)
df_by_minute = df.set_index('minute_index')
df_by_minute
                        A                B               index
minute_index                                                     
2014-07-01 06:03    62.2500        NaN 2014-07-01 06:03:59.692000
2014-07-01 06:13    62.2500     241.50 2014-07-01 06:13:34.602000
2014-07-01 06:15    62.2375     240.25 2014-07-01 06:15:42.082000

# now change the frequency to 1 minute but I just get NaNs (!)
df_by_minute.asfreq('1Min')
                            A          B   index
2014-07-01 06:03:00        NaN        NaN   NaT
2014-07-01 06:04:00        NaN        NaN   NaT
2014-07-01 06:05:00        NaN        NaN   NaT
2014-07-01 06:06:00        NaN        NaN   NaT
2014-07-01 06:07:00        NaN        NaN   NaT
2014-07-01 06:08:00        NaN        NaN   NaT
2014-07-01 06:09:00        NaN        NaN   NaT
2014-07-01 06:10:00        NaN        NaN   NaT
2014-07-01 06:11:00        NaN        NaN   NaT
2014-07-01 06:12:00        NaN        NaN   NaT
2014-07-01 06:13:00        NaN        NaN   NaT
2014-07-01 06:14:00        NaN        NaN   NaT
2014-07-01 06:15:00        NaN        NaN   NaT

As you can see, it does not work. Can someone help? What I am trying to achieve is a function that returns A or B as of a given DateTime, where DateTime is in 1-minute increments.
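For reference, a likely reason the attempt above yields only NaNs is that strftime turns the index into plain strings, so the Timestamps generated by asfreq never match any existing label. The sketch below is one possible alternative, not the asker's method: it truncates the index with floor so it stays a DatetimeIndex, keeps the last reading per minute, and then calls asfreq. It also avoids the cols and take_last arguments, which later pandas versions replaced with subset and keep. The variable names by_minute and regular are made up for illustration.

```python
import pandas as pd

# Sample rows copied from the question (irregular timestamps, some duplicated).
idx = pd.to_datetime([
    "2014-07-01 06:03:59.614", "2014-07-01 06:03:59.692",
    "2014-07-01 06:13:34.524", "2014-07-01 06:13:34.602",
    "2014-07-01 06:15:05.399", "2014-07-01 06:15:05.399",
    "2014-07-01 06:15:42.004", "2014-07-01 06:15:42.082",
    "2014-07-01 06:15:42.082",
])
nan = float("nan")
df = pd.DataFrame(
    {
        "A": [62.125, 62.25, 62.25, 62.25, 62.25, 62.25, 62.2375, 62.2375, 62.2375],
        "B": [nan, nan, 241.0625, 241.5, 241.375, 241.25, 241.25, 241.375, 240.25],
    },
    index=idx,
)
df.index.name = "date_time"

# Truncate each timestamp to its minute; the index stays a DatetimeIndex.
by_minute = df.copy()
by_minute.index = by_minute.index.floor("min")

# Keep only the last reading within each minute so the index becomes unique.
by_minute = by_minute[~by_minute.index.duplicated(keep="last")]

# The index is now unique and sits on minute boundaries, so asfreq can
# reindex it; minutes without readings are filled with NaN.
regular = by_minute.asfreq("min")
print(regular)
```

Minutes that had at least one reading keep their last value, and the gaps in between stay NaN, which can then be filled (for example with ffill) if that suits the data.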

2 Answers


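I think resample, not asfreq, fits your needs:

new = df.resample('T', how='mean')

For the how option, you can also use 'last' or 'first'. Note that the how argument was deprecated in pandas 0.18 and later removed; in newer versions the equivalent is to call the aggregation on the resampler, i.e. df.resample('T').mean() (or .last(), .first()).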
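As a minimal sketch of this approach in the method-chaining style of newer pandas versions, the example below rebuilds the question's data and resamples it to 1-minute bins. The 'min' alias is used in place of 'T', and the final ffill/asof step is only one assumed way to get the "value as of a given time" lookup the question asks for; the names per_minute_mean, per_minute_last, filled and value are illustrative.

```python
import pandas as pd

# Sample rows copied from the question.
idx = pd.to_datetime([
    "2014-07-01 06:03:59.614", "2014-07-01 06:03:59.692",
    "2014-07-01 06:13:34.524", "2014-07-01 06:13:34.602",
    "2014-07-01 06:15:05.399", "2014-07-01 06:15:05.399",
    "2014-07-01 06:15:42.004", "2014-07-01 06:15:42.082",
    "2014-07-01 06:15:42.082",
])
nan = float("nan")
df = pd.DataFrame(
    {
        "A": [62.125, 62.25, 62.25, 62.25, 62.25, 62.25, 62.2375, 62.2375, 62.2375],
        "B": [nan, nan, 241.0625, 241.5, 241.375, 241.25, 241.25, 241.375, 240.25],
    },
    index=idx,
)
df.index.name = "date_time"

# One row per minute; duplicates are aggregated, empty minutes become NaN.
per_minute_mean = df.resample("min").mean()
per_minute_last = df.resample("min").last()

# Optionally carry the last known value forward into the empty minutes.
filled = per_minute_last.ffill()

# Look up column A as of an arbitrary time (most recent value at or before it).
value = filled["A"].asof(pd.Timestamp("2014-07-01 06:10:30"))
print(per_minute_mean)
print(value)
```

Unlike asfreq, resample aggregates all readings that fall into the same bin, so duplicate or sub-minute timestamps are not a problem.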

  • thank you! Quite frustrating that I completely overlooked this function.
    – Zhubarb
    Commented Oct 14, 2014 at 12:39

Changing the time frequency of a pandas DataFrame: the resample() method of a DataFrame can change the time frequency, for example from seconds to minutes, hours, days or years. It works with both a DatetimeIndex and a datetime column.

[Screenshots: dataset time frequency; converting minutes to hours; converting hours to days]
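Since the original screenshots are not available, here is a rough sketch of the minutes-to-hours and hours-to-days conversions. The data is hypothetical (a made-up value column with one reading per minute over two days), and mean is only one of several possible aggregations. The "60min" frequency string stands in for the hourly alias, whose spelling differs between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one reading per minute over two days.
idx = pd.date_range("2014-07-01", periods=2 * 24 * 60, freq="min")
minute_df = pd.DataFrame({"value": np.arange(len(idx), dtype=float)}, index=idx)

# Minutes to hours: average the 60 readings in each hour.
hourly = minute_df.resample("60min").mean()

# Hours to days: average the 24 hourly values in each day.
daily = hourly.resample("D").mean()

# The same works when the timestamps live in a column instead of the index:
# pass the column name through the `on` parameter.
as_column = minute_df.reset_index().rename(columns={"index": "timestamp"})
hourly_from_column = as_column.resample("60min", on="timestamp").mean()

print(hourly.head())
print(daily)
```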
