Fix Memory Leak in to_json with Numeric Values #26239

WillAyd · 2019-04-29T18:03:12Z

closes Memory leak in df.to_json #24889
tests added / passed
passes `git diff upstream/master -u -- "*.py" | flake8 --diff`
whatsnew entry

It looks like the extension module is unnecessarily incrementing the reference count for numeric objects and never releasing it, which causes a leak in to_json.

Still trying to grok exactly what the purpose of npyCtxtPassthru is (a comment in the same file mentions it is required when encoding multi-dimensional arrays).

Removing the check for PyArray_ISDATETIME caused segfaults, but that path doesn't appear to leak memory anyway, so incrementing that ref count is most likely intentional.
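
For readers less familiar with reference counting, the failure mode described above (a reference count that is incremented and never released) can be sketched from pure Python. This is only an illustration, not the code in the extension module: `Payload` and `survives_after_incref` are hypothetical names, and the sketch assumes CPython, where `ctypes.pythonapi` exposes `Py_IncRef` and `Py_DecRef`.

```python
import ctypes
import gc
import weakref

# Declare the C signatures so ctypes passes the object pointer itself.
ctypes.pythonapi.Py_IncRef.argtypes = [ctypes.py_object]
ctypes.pythonapi.Py_IncRef.restype = None
ctypes.pythonapi.Py_DecRef.argtypes = [ctypes.py_object]
ctypes.pythonapi.Py_DecRef.restype = None


class Payload:
    """Hypothetical stand-in for the object whose ref count is bumped."""


def survives_after_incref(balance: bool) -> bool:
    """Return True if the object is still alive after its last name is deleted."""
    obj = Payload()
    probe = weakref.ref(obj)  # lets us observe whether obj gets freed

    ctypes.pythonapi.Py_IncRef(obj)  # extra reference, as in the leaking path
    if balance:
        ctypes.pythonapi.Py_DecRef(obj)  # matching release

    del obj
    gc.collect()
    return probe() is not None


print(survives_after_incref(balance=False))  # unbalanced increment
print(survives_after_incref(balance=True))   # increment paired with a release
```

With the unbalanced increment the object can never be freed, because its count never returns to zero. Pairing every increment with a release lets it be collected as usual.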

ASV results:

       before           after         ratio
     [9feb3ad9]       [9bfd45d3]
     <master>         <json-mem-fix~1>
-           69.1M            57.7M     0.84  io.json.ToJSONMem.peakmem_float
-           69.1M            57.7M     0.83  io.json.ToJSONMem.peakmem_int

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
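
Outside of ASV, a rough way to check for this kind of leak is to call a function repeatedly and see how much traced memory is still held afterwards. The sketch below uses only the standard library `tracemalloc` module. `traced_growth`, `leaky` and `clean` are hypothetical helpers written for this illustration, and this is not how the `peakmem_*` benchmarks above measure memory.

```python
import tracemalloc


def traced_growth(func, repeats=50):
    """Return the net bytes still allocated after calling func `repeats` times."""
    tracemalloc.start()
    try:
        func()  # warm-up call so one-time setup is not counted as growth
        before, _ = tracemalloc.get_traced_memory()
        for _ in range(repeats):
            func()
        after, _ = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before


_retained = []


def leaky():
    # Keeps every buffer alive, mimicking a reference that is never released.
    _retained.append(bytearray(100_000))


def clean():
    # The buffer is freed as soon as the call returns.
    bytearray(100_000)


print("leaky growth:", traced_growth(leaky, repeats=20))
print("clean growth:", traced_growth(clean, repeats=20))
```

With pandas installed, something like `traced_growth(lambda: df.to_json())` could be tried for a numeric `df`. Whether the leak shows up that way is an assumption, since `tracemalloc` only sees allocations that go through allocators it hooks.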

codecov · 2019-04-29T18:42:04Z

Codecov Report

Merging #26239 into master will decrease coverage by <.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master   #26239      +/-   ##
==========================================
- Coverage   91.97%   91.96%   -0.01%     
==========================================
  Files         175      175              
  Lines       52368    52368              
==========================================
- Hits        48164    48160       -4     
- Misses       4204     4208       +4

Flag	Coverage Δ
#multiple	`90.52% <ø> (ø)`	⬆️
#single	`40.69% <ø> (-0.15%)`	⬇️

Impacted Files	Coverage Δ
pandas/io/gbq.py	`78.94% <0%> (-10.53%)`	⬇️
pandas/core/frame.py	`96.9% <0%> (-0.12%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9feb3ad...24e1a23. Read the comment docs.

jreback · 2019-04-30T13:04:09Z

thanks @WillAyd

WillAyd added 4 commits April 29, 2019 09:49

Removed PyArray_Number special handling

9bfd45d

Added ASV for memory benchmark

94ca581

Expanded benchmark to cover floats

a908bc0

Added whatsnew

24e1a23

WillAyd added the IO JSON read_json, to_json, json_normalize label Apr 29, 2019

jreback added this to the 0.25.0 milestone Apr 30, 2019

jreback merged commit 5cb006f into pandas-dev:master Apr 30, 2019

WillAyd deleted the json-mem-fix branch January 16, 2020 00:34
