Skip to content

BUG: Segfault in to_json with tzaware datetime #45558

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jan 24, 2022

Conversation

lithomas1
Copy link
Member

Basically, timedelta.total_seconds returns a float. PyInt_AsLong(actually PyLong_AsLong) wants an int, but would cast using int for Pythons < 3.10. This stopped happening in python 3.10. (https://docs.python.org/3/c-api/long.html#c.PyLong_AsLong).

IMO, it probably is fine to cast the result from float to int, as I don't think that any timezone has an offset from UTC that contains fractions of a second. But, I could be wrong given that I don't use timezones and are not too familiar with them.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice ping on greenish

@lithomas1 lithomas1 added Bug IO JSON read_json, to_json, json_normalize Segfault Non-Recoverable Error labels Jan 22, 2022
@lithomas1 lithomas1 added this to the 1.4.1 milestone Jan 22, 2022
Copy link
Member

@jbrockmendel jbrockmendel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lithomas1 lithomas1 marked this pull request as ready for review January 22, 2022 23:47
@@ -378,7 +378,7 @@ int convert_pydatetime_to_datetimestruct(PyObject *dtobj,
if (tmp == NULL) {
return -1;
}
seconds_offset = PyInt_AsLong(tmp);
seconds_offset = PyInt_AsLong(PyNumber_Long(tmp));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PyNumber_Long returns a new reference, so the way this is currently written it will leak memory. You ideally should break this into a few lines and be sure to call Py_DECREF on the object returned from PyNumber_Long

While touching this might as well also switch from PyInt_AsLong to PyLong_AsLong

Copy link
Member Author

@lithomas1 lithomas1 Jan 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While touching this might as well also switch from PyInt_AsLong to PyLong_AsLong

Will try to do this as a follow up as I'm planning to have this PR backported.

@@ -378,11 +379,14 @@ int convert_pydatetime_to_datetimestruct(PyObject *dtobj,
if (tmp == NULL) {
return -1;
}
seconds_offset = PyInt_AsLong(tmp);
tmp_int = PyNumber_Long(tmp);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You will also want to do a if (tmp_int == NULL) check and return an error / DECREF tmp if that happens. Otherwise lgtm

@WillAyd WillAyd merged commit 24e360a into pandas-dev:main Jan 24, 2022
@WillAyd
Copy link
Member

WillAyd commented Jan 24, 2022

Thanks @lithomas1 awesome work

meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request Jan 24, 2022
jreback pushed a commit that referenced this pull request Jan 24, 2022
@lithomas1 lithomas1 deleted the fix-json-segfault branch January 24, 2022 14:22
yehoshuadimarsky pushed a commit to yehoshuadimarsky/pandas that referenced this pull request Jul 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug IO JSON read_json, to_json, json_normalize Segfault Non-Recoverable Error
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: segfault from json.dumps in Python 3.10
5 participants