Skip to content

Fix plotting memory leak #9307

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Conversation

qwhelan
Copy link
Contributor

@qwhelan qwhelan commented Jan 20, 2015

This PR resolves #9003 (and explains matplotlib/matplotlib#3892). The root cause of the memory leak is a reference cycle between MPLPlot objects and the AxesSubplot objects they create.

Specifically, a plotf function object is stored in ax._plot_data for the purposes of potentially redrawing if the data needs to be resampled. This would be fine if this were a top-level function; however, these are all nested functions that make use of self. This means that by plotf pulls in self.ax and self.axes, which point to the AxesSubplot that plotf is being attached to. We therefore have a reference cycle:

AxesSubplot -> AxesSubplot._plot_data -> plotf -> self -> self.ax -> AxesSubplot

In order to make the objects collectable, we need to either explicitly break a link or replace it with a weakref. Weakrefs don't work as AxesSubplot and MPLPlot have the same lifetime. Just not using _plot_data prevents the leak but breaks functionality.

The final option as I see it is to change plotf to not depend on self. This works but involves a fair amount of modifications. I elected to make each of the plotfs top-level functions to make the lack of self-dependency explicit. This also required making several other functions classmethods and moving some data from MPLPlot to the AxesSubplot object.

The key assumption being made by this change is that either MPLPlot objects are discarded immediately after use or we don't want any modifications to the MPLPlot (e.g., adding errors post-plotting) to be reflected if redrawing. I believe both cases are true but this patch has the potential for behavioral changes if MPLPlot objects are regularly being retained and modified.

I've also added a memory-leak test to prevent a regression; the test fails as expected if applied without the other commits in this patch.

@qwhelan qwhelan force-pushed the fix_memory_leak branch 2 times, most recently from 12dee0b to 1a9769a Compare January 25, 2015 01:32
@qwhelan
Copy link
Contributor Author

qwhelan commented Jan 25, 2015

@jreback I think this patch is ready for review. Please let me know if you have any questions or suggestions.

@jorisvandenbossche
Copy link
Member

cc @TomAugspurger @sinhrks

@jorisvandenbossche jorisvandenbossche added this to the 0.16.0 milestone Jan 25, 2015
@sinhrks
Copy link
Member

sinhrks commented Feb 7, 2015

Specifically, a plotf function object is stored in ax._plot_data

I understand you're referring to this. If so, only line and area are affected and no need to change others?

@qwhelan
Copy link
Contributor Author

qwhelan commented Feb 14, 2015

@sinhrks Yep, it's a subtle bug so I thought it'd be better to convert everything to a safer approach to prevent a reoccurrence.

@sinhrks
Copy link
Member

sinhrks commented Feb 15, 2015

Understood. What I'm caring is current impl makes it more difficult to understand the logic...

How about:

  • Change closure to normal function (not return a function, but uses a function defined in the module directly)
  • Make mapping from class to each plot function and unify separated _get_plot_function.
class LinePlot(MPLPlot):
    plotf = _lineplot_plotf
# or 
_plotf_mapping = {LinePlot: _lineplot_plotf, ...}

@jreback
Copy link
Contributor

jreback commented Mar 5, 2015

@sinhrks @TomAugspurger can you have a quick look?

@jreback
Copy link
Contributor

jreback commented Mar 5, 2015

@qwhelan can you squash to a single commit? and pls add a release note referncing the bug?

@jreback
Copy link
Contributor

jreback commented Mar 7, 2015

@TomAugspurger can you have a look?

1 similar comment
@jreback
Copy link
Contributor

jreback commented Mar 10, 2015

@TomAugspurger can you have a look?

@TomAugspurger
Copy link
Contributor

I've looked at this a bit. Haven't had a chance to fully digest it since I've never had to deal with this kind of problem before. I'll look more and let you know. It may be something to push for another release.

On Mar 10, 2015, at 18:20, jreback [email protected] wrote:

@TomAugspurger can you have a look?


Reply to this email directly or view it on GitHub.

@jreback jreback modified the milestones: 0.16.1, 0.16.0 Mar 11, 2015
@qwhelan
Copy link
Contributor Author

qwhelan commented Mar 19, 2015

@jreback Hey, sorry about the delay - made the mistake of letting Inbox decide what emails I see. I'll squash it now. Let me know what version I should put it under for the release notes.

@qwhelan
Copy link
Contributor Author

qwhelan commented Mar 19, 2015

@jreback Given we're at a rc1, I'm going to wait on adding to the docs until 0.16.1 is in development. As @sinhrks pointed out, it'd be possible to trim this down to just LinePlot and AreaPlot if that makes including this in a point-release more palatable.

@jreback
Copy link
Contributor

jreback commented Mar 20, 2015

whats new is now available for 0.16.1 5ebf521

@qwhelan
Copy link
Contributor Author

qwhelan commented Apr 19, 2015

Closing due to better version available in #9814

@qwhelan qwhelan closed this Apr 19, 2015
@jorisvandenbossche jorisvandenbossche modified the milestones: 0.17.0, 0.16.2 Jun 2, 2015
@jorisvandenbossche jorisvandenbossche modified the milestones: 0.16.2, 0.17.0, No action Jun 2, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Memory leak with pandas plot
5 participants