-
-
Notifications
You must be signed in to change notification settings - Fork 18.5k
STYLE enable pylint: method-cache-max-size-none #49751
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
STYLE enable pylint: method-cache-max-size-none #49751
Conversation
Thanks @natmokval - not sure, I think this might be a case where the check might be telling us something useful @mroeschke do you happen to remember why |
Not too familiar with excel styling, but probably was thinking the opposite, why should the cache size be limited? Could the caching be refactored to an independent function like in https://pylint.pycqa.org/en/latest/user_guide/messages/warning/method-cache-max-size-none.html? |
Thanks for checking @MarcoGorelli. In case I set |
Thanks, @mroeschke. On the surface I don’t see how the below content can be moved out of the class:
Can you please advise me if you see a direction? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should fix this one, my programming messiah Anthony Sottile has a video on how lru_cache on methods should be avoided: https://youtu.be/sVjtp6tGo0g
I haven't tested this out yet, but I'm "requesting changes" to block this for now
If I am understanding how this is used from my cursory scan of the code, this is used in So to justify changing maxsize from |
The number of total possible CSS properties must only be 100-200. So to use them all per cell is never done. V Rare to go more that 10. Of course potentially lots more independent declarations if going by number of cells, but I'm assuming you meant the former. |
Thanks - let's just go with the default (128) and put |
Sounds good |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked into this a bit more closely (and trying to repeat what Anthony does in the video I linked), and I think there's a real issue here
For this investigation, I applied the following diff:
diff --git a/pandas/io/formats/excel.py b/pandas/io/formats/excel.py
index a26b85390f..3a8a9ee960 100644
--- a/pandas/io/formats/excel.py
+++ b/pandas/io/formats/excel.py
@@ -170,6 +170,9 @@ class CSSToExcelConverter:
self.inherited = self.compute_css(inherited)
else:
self.inherited = None
+
+ def __del__(self) -> None:
+ print('heeeelp I\'m being deeeleteeeed')
compute_css = CSSResolver()
@@ -193,6 +196,7 @@ class CSSToExcelConverter:
A style as interpreted by ExcelWriter when found in
ExcelCell.style.
"""
+ print('I\'m being computed (not looked up in cache)')
properties = self.compute_css(declarations, self.inherited)
return self.build_xlstyle(properties)
Once converter.__call__
has been called garbage collection no longer deletes converter
!
In [1]: from pandas.io.formats.excel import CSSToExcelConverter
In [2]: import gc
In [3]: converter = CSSToExcelConverter("font-weight: bold")
In [4]: converter('vertical-align: top')
I'm being computed (not looked up in cache)
Out[4]: {'alignment': {'vertical': 'top'}, 'font': {'bold': True}}
In [5]: converter = None
In [6]: gc.collect() # converter doesn't get deleted!
Out[6]: 373
However, if we then apply this trick from the video:
diff --git a/pandas/io/formats/excel.py b/pandas/io/formats/excel.py
index a26b85390f..692dd23e95 100644
--- a/pandas/io/formats/excel.py
+++ b/pandas/io/formats/excel.py
@@ -170,12 +170,20 @@ class CSSToExcelConverter:
self.inherited = self.compute_css(inherited)
else:
self.inherited = None
+ self._call_cached = lru_cache(maxsize=None)(self._call_uncached)
compute_css = CSSResolver()
- @lru_cache(maxsize=None)
def __call__(
self, declarations: str | frozenset[tuple[str, str]]
+ ) -> dict[str, dict[str, str]]:
+ return self._call_cached(declarations)
+
+ def _call_uncached(
+ self, declarations: str | frozenset[tuple[str, str]]
) -> dict[str, dict[str, str]]:
"""
Convert CSS declarations to ExcelWriter style.
@@ -193,6 +201,7 @@ class CSSToExcelConverter:
A style as interpreted by ExcelWriter when found in
ExcelCell.style.
"""
properties = self.compute_css(declarations, self.inherited)
return self.build_xlstyle(properties)
then it looks like it solves the issue:
In [14]: converter = CSSToExcelConverter("font-weight: bold")
In [15]: converter('vertical-align: top')
I'm being computed (not looked up in cache)
Out[15]: {'alignment': {'vertical': 'top'}, 'font': {'bold': True}}
In [16]: converter('vertical-align: top') # caching still works
Out[16]: {'alignment': {'vertical': 'top'}, 'font': {'bold': True}}
In [17]: converter = None
In [18]: gc.collect() # now, it gets deleted!
heeeelp I'm being deeeleteeeed
Out[18]: 698
I think what needs doing, then is to apply the second diff from #49751 (review) , but:
|
Changes from the second diff are applied. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good pending green, well done!
Couple last requests:
- this should probably be backported to 1.5.2, could you add a note to the v1.5.2 whatsnew note please? Just say that a memory leak has been fixed, and mention the class
- just left a really minor comment
Co-authored-by: Marco Edward Gorelli <[email protected]>
Mentioned the memory leak in v1.5.2 whatsnew with issue #48855. Maybe it would be better to skip the number or create a new issue? |
Co-authored-by: Matthew Roeschke <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. cc @MarcoGorelli merge when ready
The PR number is fine if there's no issue open Cool, let's wait for tests and we can ship it |
@@ -134,7 +134,6 @@ disable = [ | |||
"invalid-envvar-default", | |||
"invalid-overridden-method", | |||
"keyword-arg-before-vararg", | |||
"method-cache-max-size-none", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MarcoGorelli have we been using pylint before releasing 1.5? Not sure if this change will cause a merge conflict if backported, and I don't think it's worth backporting pylint configurations to 1.5.x
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree about not backporting the configuration, I'd just backport the fix itself - I'd expect to get a merge conflict, I'll sort this out there
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @natmokval !
Oops, something went wrong applying the patch ... Please have a look at my logs. |
Thank you @MarcoGorelli and @mroeschke for your help and support. Your explanations helped me to understand the problem. |
* STYLE enable pylint: method-cache-max-size-none * STYLE enable pylint: method-cache-max-size-none II * Minor comment update Co-authored-by: Marco Edward Gorelli <[email protected]> * add a note to the v1.5.2 whatsnew * text improvement Co-authored-by: Matthew Roeschke <[email protected]> Co-authored-by: Marco Edward Gorelli <[email protected]> Co-authored-by: Matthew Roeschke <[email protected]>
Issue #48855. This PR enables pylint type "W" warning:
method-cache-max-size-none
Based on issue #47371 PERF: Improve Styler to_excel Performance I believe the warning is a false-positive in pandas/io/formats/excel.py#L176.
I disabled this warning on the line 176