
Error when converting df to json table (utc timezone date time object causes the error) #39537


Open
franz101 opened this issue Feb 1, 2021 · 4 comments
Labels
Bug IO JSON read_json, to_json, json_normalize Timezones Timezone data dtype

Comments

@franz101

franz101 commented Feb 1, 2021

Calling df.to_json(orient="table", index=False) on a DataFrame that contains datetime.now(timezone.utc) objects raises the following error. Without orient="table" it works.

`---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
in
----> 1 x = current_opportunites.to_json(orient="table",index=False)

/opt/conda/lib/python3.8/site-packages/pandas/core/generic.py in to_json(self, path_or_buf, orient, date_format, double_precision, force_ascii, date_unit, default_handler, lines, compression, index, indent, storage_options)
2463 indent = indent or 0
2464
-> 2465 return json.to_json(
2466 path_or_buf=path_or_buf,
2467 obj=self,

/opt/conda/lib/python3.8/site-packages/pandas/io/json/_json.py in to_json(path_or_buf, obj, orient, date_format, double_precision, force_ascii, date_unit, default_handler, lines, compression, index, indent, storage_options)
83 raise NotImplementedError("'obj' should be a Series or a DataFrame")
84
---> 85 s = writer(
86 obj,
87 orient=orient,

/opt/conda/lib/python3.8/site-packages/pandas/io/json/_json.py in __init__(self, obj, orient, date_format, double_precision, ensure_ascii, date_unit, index, default_handler, indent)
249 raise ValueError(msg)
250
--> 251 self.schema = build_table_schema(obj, index=self.index)
252
253 # NotImplemented on a column MultiIndex

/opt/conda/lib/python3.8/site-packages/pandas/io/json/_table_schema.py in build_table_schema(data, index, primary_key, version)
261 if data.ndim > 1:
262 for column, s in data.items():
--> 263 fields.append(convert_pandas_type_to_json_field(s))
264 else:
265 fields.append(convert_pandas_type_to_json_field(data))

/opt/conda/lib/python3.8/site-packages/pandas/io/json/_table_schema.py in convert_pandas_type_to_json_field(arr)
122 field["freq"] = dtype.freq.freqstr
123 elif is_datetime64tz_dtype(dtype):
--> 124 field["tz"] = dtype.tz.zone
125 return field
126

AttributeError: 'datetime.timezone' object has no attribute 'zone'`

The failing branch in _table_schema.py is:

elif is_datetime64tz_dtype(dtype):

I will have a deeper look later.
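For context, the traceback comes down to an attribute that pytz-style timezones carry but the standard library's datetime.timezone does not. The following is a minimal sketch using only the standard library. tz_label is a hypothetical helper, not part of pandas, and its str() fallback is only an assumption about what a sensible label would be.

```python
from datetime import timedelta, timezone


def tz_label(tz):
    """Return a printable name for a tzinfo object (hypothetical helper)."""
    # pytz timezones expose their name as `.zone`; datetime.timezone does not,
    # so fall back to str(tz) instead of raising AttributeError.
    zone = getattr(tz, "zone", None)
    return zone if zone is not None else str(tz)


# datetime.timezone objects have no `.zone` attribute at all.
has_zone = hasattr(timezone.utc, "zone")

utc_label = tz_label(timezone.utc)
offset_label = tz_label(timezone(timedelta(hours=1)))
print(has_zone, utc_label, offset_label)
```

The getattr fallback avoids a try/except and treats any tzinfo without a .zone attribute the same way.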

@attack68
Contributor

attack68 commented Feb 1, 2021

Can you share a minimal reproducible snippet so we can validate this?

@franz101
Author

franz101 commented Feb 2, 2021

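from datetime import datetime, timezone
import pandas as pd

# Fails with the AttributeError shown above
test = pd.DataFrame([{"name": "foo", "time": datetime.now(timezone.utc)}]).to_json(orient="table")

This works like a charm:
pd.DataFrame([{"name": "foo", "time": datetime.now(timezone.utc)}]).to_json(orient="records")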

@franz101
Author

franz101 commented Feb 2, 2021

I'm not sure what the best practice is for datetime, JSON and pandas.
pd.DataFrame([{"name": "foo", "time": datetime.now(timezone.utc)}]).to_json(orient="records")

won't parse the dates when read back:
pd.read_json(test)

It also won't work with date_format='iso'.

This will return the datetime format:
json.dumps(pd.DataFrame([{"name": "foo", "time": datetime.now(timezone.utc)}]).to_dict("records"), default=str)

Maybe I made some small mistakes. What is the best practice here for datetimes with UTC timezones?
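One way to keep the UTC offset through a JSON round trip, independent of pandas, is to write ISO 8601 strings and parse them back explicitly. This is a minimal sketch with the standard library only, not a statement of what pandas recommends. encode_datetime is a hypothetical helper name, and a fixed timestamp stands in for datetime.now(timezone.utc) so the result is reproducible.

```python
import json
from datetime import datetime, timezone


def encode_datetime(obj):
    """json.dumps fallback: serialise datetimes as ISO 8601 strings."""
    if isinstance(obj, datetime):
        return obj.isoformat()
    raise TypeError(f"Cannot serialise {type(obj).__name__}")


# Fixed timestamp instead of datetime.now(timezone.utc), for reproducibility.
records = [{"name": "foo", "time": datetime(2021, 2, 2, 12, 0, tzinfo=timezone.utc)}]

# The offset (+00:00) is written into the string.
payload = json.dumps(records, default=encode_datetime)

# Parse the ISO strings back into timezone-aware datetimes.
restored = [
    {**row, "time": datetime.fromisoformat(row["time"])}
    for row in json.loads(payload)
]
print(payload)
```

Unlike default=str, isoformat() produces a string that datetime.fromisoformat can read back directly.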

@attack68
Contributor

attack68 commented Feb 2, 2021

Timezone support in to_json and read_json only works with orient="table". Other orient values may still produce output, but the timezone info will be lost, i.e. the values are converted to UTC.
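To illustrate what "converted to UTC" means for a single value, here is a small standard-library sketch. It is not the pandas code path, only an analogy: the instant in time survives the conversion, but the original offset does not.

```python
from datetime import datetime, timedelta, timezone

# A timestamp with a +01:00 offset (fixed value for reproducibility).
local = datetime(2021, 2, 2, 13, 0, tzinfo=timezone(timedelta(hours=1)))

# Converting to UTC keeps the instant but replaces the offset.
as_utc = local.astimezone(timezone.utc)

same_instant = as_utc == local
offset_kept = as_utc.utcoffset() == local.utcoffset()
print(as_utc.isoformat(), same_instant, offset_kept)
```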

The error you are getting can be patched with:

# _table_schema.py, line 123

    elif is_datetime64tz_dtype(dtype):
        try:
            # pytz timezones expose their name as .zone
            field["tz"] = dtype.tz.zone
        except AttributeError:
            # e.g. datetime.timezone: take the tz part of the dtype name
            field["tz"] = dtype.name[15:-1]

Apparently the way you are creating a DateTime object, e.g.:

val = datetime.now(timezone(timedelta(hours=1)))
test = pd.DataFrame({"name": ["foo", "bar"], "time": [val, val + timedelta(days=1)]})

leads to the AttributeError you are getting, so the tz info must be extracted from somewhere else on the dtype object. Perhaps the alternative above (albeit ugly) might also work in all cases.
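The slice in the patch can be checked on plain strings. The sketch below assumes the dtype name has the form datetime64[ns, <tz>], which is inferred from the offset 15 used in the patch. tz_from_dtype_name is a hypothetical helper, not a pandas function, shown as a possible alternative to the fixed index.

```python
from datetime import timedelta, timezone


def tz_from_dtype_name(name):
    """Extract the timezone part of a name like 'datetime64[ns, UTC]'."""
    # Split on the first ", " instead of slicing at a fixed index 15,
    # so the unit does not have to be exactly two characters long.
    _, _, rest = name.partition(", ")
    return rest[:-1]  # drop the trailing "]"


fixed_offset = timezone(timedelta(hours=1))
# Assumed to mirror the dtype name format; built by hand here, not by pandas.
name = f"datetime64[ns, {fixed_offset}]"

sliced = name[15:-1]              # the slice from the patch
parsed = tz_from_dtype_name(name)  # the partition-based alternative
print(name, sliced, parsed)
```

The fixed index works because "datetime64[ns, " is 15 characters long. With a one-character unit such as s, the slice would cut into the timezone name, which is why the partition variant may be the safer choice.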

Note that if you define your datetimes using pandas-native objects, you avoid the error in the first place:

val = pd.to_datetime(datetime.now()).tz_localize('UTC')
test = pd.DataFrame({"name": ["foo", "bar"], "time": [val, val + timedelta(days=1)]})

The following also works if you convert objects that were initialised with a timezone to pandas-native ones:

val = pd.to_datetime(datetime.now(timezone(timedelta(hours=1)))).tz_convert('UTC')
test = pd.DataFrame({"name": ["foo", "bar"], "time": [val, val + timedelta(days=1)]})
