I stuck at one issue related group by when the time interval is more than 1 day. It's giving the wrong start time for different grain when grain is more than 1 day.
- Grain = 1day
- ExpectedStartTime=2019-01-01
- ActualStartTime=2019-01-01
> select mean("messages") from rabbitmq where host='rabbitmq_cluster' and time>='2019-01-01 00:00:00' and time<'2019-01-16 00:00:00' GROUP BY time(1d), "host" LIMIT 2;
time mean_messages
---- ----
2019-01-01T00:00:00Z 181232
2019-01-02T00:00:00Z 179728
- Grain = 2day
- ExpectedStartTime=2019-01-01
- ActualStartTime=2018-12-31
> select mean("messages") from rabbitmq where host='rabbitmq_cluster' and time>='2019-01-01 00:00:00' and time<'2019-01-16 00:00:00' GROUP BY time(2d), "host" LIMIT 2;
time mean_messages
---- ----
2018-12-31T00:00:00Z 181232
2019-01-02T00:00:00Z 347824
- Grain = 5day
- ExpectedStartTime=2019-01-01
- ActualStartTime=2018-12-30
> select mean("messages") from rabbitmq where host='rabbitmq_cluster' and time>='2019-01-01 00:00:00' and time<'2019-01-16 00:00:00' GROUP BY time(5d), "host" LIMIT 2;
time mean_messages
---- ----
2018-12-30T00:00:00Z 529056
2019-01-04T00:00:00Z 826694.3999999762
I read in the documentation that Influx uses present time boundary, but doesn't say how present time boundary is calculated. Is it a start of a month or a start of a week or time of first data received or time of the shard starting?
If I know how this present time boundary is being calculated, I can specify offset in groupby to keep the first slot starting from 2019-01-01.