Fix Memory Leak in to_json with Numeric Values #26239
Merged
`df.to_json`
- #24889
- `git diff upstream/master -u -- "*.py" | flake8 --diff`
It looks like the extension module unnecessarily increments the reference count of numeric objects and never releases it, which causes the leak in `to_json`.
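A leak of this kind can be confirmed from pure Python without reading the C source: `sys.getrefcount` reports how many references an object has, so calling the suspect encoder in a loop and watching the count grow is a quick diagnostic. The sketch below uses a deliberately leaky pure-Python stand-in (`leaky_encode`, a hypothetical name, not the actual extension code) in place of `to_json` to show the pattern.

```python
import sys

_hidden = []  # simulates the extension's forgotten reference

def leaky_encode(obj):
    # Hypothetical stand-in for the buggy encoder: it keeps an extra
    # reference to obj and never releases it -- the Python-level
    # equivalent of Py_INCREF without a matching Py_DECREF.
    _hidden.append(obj)
    return str(obj)

value = 12345.678
before = sys.getrefcount(value)
for _ in range(100):
    leaky_encode(value)
after = sys.getrefcount(value)

# A correct encoder leaves the refcount unchanged; a leaky one
# grows it by one per call.
print(after - before)  # → 100
```

The same check against the real `df.to_json` (refcount of a value held by the frame, before and after repeated serialization) is how one would verify the fix actually releases the references.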
Still trying to grok exactly what the purpose of npyCtxtPassthru is (a comment in the same file says it is required when encoding multi-dimensional arrays).
Removing the check for PyArray_ISDATETIME caused segfaults, but that path doesn't appear to leak memory anyway, so the ref count increment there is most likely intentional.
ASV results: