
I have several dataframes in which datetime columns are stored as strings. The datetime formats vary across columns within a dataframe and across dataframes. I want to produce a Unix timestamp that an ArcGIS application will display in the local timezone.

For example, one such dataframe is the following:

import pandas as pd
time_dict = {"datetime": ["2023-08-15T15:32:47.687+00:00", ""]}
test_df = pd.DataFrame(time_dict)

I tried a number of simple functions. When I submitted the unix timestamp to the ArcGIS application, it gave a date-time that was 4 hours earlier than the local time. When I tried to make some corrections for the timezone, I encountered a TypeError saying that the timezone was already present in the date.
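The TypeError mentioned here can be reproduced in a few lines. This is a minimal sketch, assuming the strings carry a UTC offset as in the example above; the fixed UTC-4 offset is an assumption chosen to match the 4-hour difference described, not something taken from the ArcGIS setup.

```python
from datetime import timedelta, timezone

import pandas as pd

# Parsing a string with "+00:00" yields a tz-aware (UTC) column.
aware = pd.to_datetime(pd.Series(["2023-08-15T15:32:47.687+00:00"]))

# tz_localize is meant for naive values, so it refuses a tz-aware column.
try:
    aware.dt.tz_localize("UTC")
    raised = False
except TypeError as exc:
    raised = True
    print(f"TypeError: {exc}")

# tz_convert is the call that accepts tz-aware data. It changes the
# displayed wall-clock time but not the instant being represented.
assumed_local = timezone(timedelta(hours=-4))  # assumption: fixed UTC-4
converted = aware.dt.tz_convert(assumed_local)
print(converted.iloc[0])
```

The converted value shows a different hour, yet it still compares equal to the original because both describe the same instant.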

I will provide a solution below, but maybe someone has a better one.

  • Unix time refers to an epoch in UTC (1970-01-01); there is no such thing as "unix timestamp that reflects the local timezone". That would be confusing at best. Commented May 8 at 15:31
  • @FObersteiner I revised the ask to better reflect the problem. ArcGIS interprets the unix timestamp as a datetime. The datetime shown was 4 hours earlier than the local time. I think this is because python converts the datetime to UTC.
    – Ted M.
    Commented May 8 at 17:41
  • ok so first thing to clarify, if you're feeding Unix time into some application (ArcGIS), it's the application's job to handle time zones. Unix time does not carry that. The other thing is that in your example above, you're parsing a string, which has an offset from UTC ("+00:00"; you could say that is UTC). You could convert that to another timezone, but for Unix time output, that would make no difference. Commented May 8 at 18:18
  • Yes, it's possible that the application is not handling the time zones as expected or desired. As a result, I needed a workaround. And this is one of the reasons for the problem being mentioned. I was creating date-times with datetime.now(), converting to unix time, pushing to the application, and then seeing a very different time from the one on my computer.
    – Ted M.
    Commented May 8 at 19:06
  • oh ok, careful there with mixing pandas datetime and vanilla Python datetime. Another caveat coming up ^^ If you call datetime.now().timestamp(), Python will assume that you want local date/time, so to calculate Unix time, it will convert that to UTC internally, then calculate Unix time. pandas on the other hand, using e.g. pd.Timestamp("now").timestamp() will give you something that looks like local, but behaves like UTC (compare the timestamps...). So, some footguns available. Commented May 8 at 19:12
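The difference described in the last comment can be checked with a fixed naive value instead of "now", which keeps the result repeatable. This is a rough sketch; the date used is arbitrary and the variable names are only illustrative.

```python
from datetime import datetime, timezone

import pandas as pd

# A naive datetime: wall-clock fields only, no tzinfo attached.
naive = datetime(2023, 8, 15, 11, 32, 47)

# Standard-library datetime treats a naive value as local time.
py_epoch = naive.timestamp()

# pandas treats the same naive value as if it were UTC.
pd_epoch = pd.Timestamp(naive).timestamp()

# Reference values that make each interpretation explicit.
as_utc = naive.replace(tzinfo=timezone.utc).timestamp()
as_local = naive.astimezone().timestamp()

print(py_epoch == as_local)  # datetime: naive means local
print(pd_epoch == as_utc)    # pandas: naive means UTC
print(pd_epoch - py_epoch)   # the machine's UTC offset in seconds
```

On a machine whose clock is set to UTC the two results coincide, which is why the mismatch can go unnoticed until the code runs elsewhere.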

2 Answers


Here is a solution that I found to work well for the problem mentioned above:

import pandas as pd
from datetime import datetime
import tzlocal

def unix_datetime(df, col):
    """
    Convert a string datetime column to Unix time in milliseconds, in place.
    The strftime round trip drops any UTC offset in the strings (a tz-aware
    column makes tz_localize raise a TypeError) along with fractional seconds.
    The remaining wall-clock time is then treated as local time.
    df is the dataframe
    col is the column name as a string
    """
    time_zone = tzlocal.get_localzone_name()
    df[col] = pd.to_datetime(df[col]).dt.strftime("%Y-%m-%d %I:%M:%S %p")
    df[col] = pd.to_datetime(df[col], errors="coerce")
    df[col] = df[col].dt.tz_localize(time_zone).dt.tz_convert(time_zone)
    df[col] = df[col].apply(lambda x: int(x.timestamp() * 1000) if pd.notnull(x) else x)
    
time_dict = {"datetime": ["2023-08-15T15:32:47.687+00:00", ""]}

test_df = pd.DataFrame(time_dict)

test_df = test_df.fillna("").copy()

unix_datetime(test_df, "datetime")
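
For comparison, here is a sketch of the plain conversion: parse with utc=True so the offset in each string is honored, then count milliseconds since the epoch. The helper name to_epoch_ms is hypothetical, and the sketch assumes every non-empty string in the column shares one ISO 8601 layout. Unlike unix_datetime above, it returns the true instant, so it does not apply the local-time shift that the workaround relies on.

```python
import pandas as pd


def to_epoch_ms(series):
    """Return Unix time in milliseconds for a column of datetime strings.

    Hypothetical helper: unparseable or empty strings become <NA>.
    """
    # utc=True normalizes every parsed value to UTC, whatever its offset.
    parsed = pd.to_datetime(series, errors="coerce", utc=True)
    # Whole milliseconds elapsed since the Unix epoch.
    elapsed = parsed - pd.Timestamp("1970-01-01", tz="UTC")
    millis = elapsed // pd.Timedelta(milliseconds=1)
    # The nullable integer dtype keeps missing values without using floats.
    return millis.astype("Int64")


example = pd.Series(["2023-08-15T15:32:47.687+00:00", ""])
result = to_epoch_ms(example)
print(result)
```

Returning a new Series instead of overwriting the column leaves the original strings available for checking.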
  • .dt.tz_localize(time_zone).dt.tz_convert(time_zone) is a no-op; what are you trying to achieve here? Commented May 8 at 15:33
  • @FObersteiner As mentioned above, an ArcGIS application interprets the unix timestamp as a datetime. Since python converts the date-time to UTC, the date-time shown was 4 hours earlier than local time. I had to use .dt.tz_localize(time_zone).dt.tz_convert(time_zone) to convert to local time to get the correct interpretation of the unix time stamp
    – Ted M.
    Commented May 8 at 17:46
  • just to illustrate my comment; compare the output from pd.to_datetime("2023-08-15T15:32:47.687+00:00").timestamp() and pd.to_datetime("2023-08-15T20:32:47.687+05:00").timestamp(). Notice that there are different times and UTC offsets in the datetime strings. However, both map to the same Unix time. So those datetime strings could originate from different time zones, but they would represent the same instant in time, therefore Unix time is the same as well. Commented May 8 at 18:22
  • That makes perfect sense, and it's what I would expect.
    – Ted M.
    Commented May 8 at 19:10
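
The comparison suggested in the comments can be run directly. A minimal sketch using the two strings quoted there; the variable names are only illustrative.

```python
import pandas as pd

# Same instant written with two different UTC offsets.
utc_string = "2023-08-15T15:32:47.687+00:00"
plus5_string = "2023-08-15T20:32:47.687+05:00"

epoch_from_utc = pd.to_datetime(utc_string).timestamp()
epoch_from_plus5 = pd.to_datetime(plus5_string).timestamp()

# The wall-clock hours differ, but the Unix times are identical.
print(pd.to_datetime(utc_string).hour, pd.to_datetime(plus5_string).hour)
print(epoch_from_utc == epoch_from_plus5)
```

Because the offset is folded in before the epoch value is computed, changing the timezone of a tz-aware value cannot change the number that is sent to the application.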

A colleague suggested that we just change the data type for dates in the ArcGIS application to String!
