-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
More speedups for Period comparisons #21606
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #21606 +/- ##
==========================================
- Coverage 91.9% 91.89% -0.01%
==========================================
Files 154 154
Lines 49562 49543 -19
==========================================
- Hits 45549 45530 -19
Misses 4013 4013
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add this issue to your existing perf issue (always do that on follow ons)
pandas/_libs/tslibs/offsets.pyx
Outdated
@@ -313,6 +313,30 @@ class _BaseOffset(object): | |||
def __setattr__(self, name, value): | |||
raise AttributeError("DateOffset objects are immutable.") | |||
|
|||
def __eq__(self, other): | |||
if is_string_object(other): | |||
from pandas.tseries.frequencies import to_offset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe you want to define a function to return this import, we use this in a number of places?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't this just be trading one one-liner for another?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no then then its a function which we can import at the top (and of course if we eventually are able to actually move to_offset to cython its a big win). but its 'cleaner' I think. (certainly can followup on this in new PR). I just don't really like the imports hidden in the functions (as we must do currently), and rather centrailize in a single place.
pandas/_libs/tslibs/offsets.pyx
Outdated
|
||
@property | ||
def _params(self): | ||
from pandas.errors import AbstractMethodError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you import this at the top?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just a minor comment. ping on green.
pandas/_libs/tslibs/offsets.pyx
Outdated
@@ -313,6 +313,30 @@ class _BaseOffset(object): | |||
def __setattr__(self, name, value): | |||
raise AttributeError("DateOffset objects are immutable.") | |||
|
|||
def __eq__(self, other): | |||
if is_string_object(other): | |||
from pandas.tseries.frequencies import to_offset |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no then then its a function which we can import at the top (and of course if we eventually are able to actually move to_offset to cython its a big win). but its 'cleaner' I think. (certainly can followup on this in new PR). I just don't really like the imports hidden in the functions (as we must do currently), and rather centrailize in a single place.
pandas/tseries/offsets.py
Outdated
@@ -182,6 +182,8 @@ def __add__(date): | |||
|
|||
Since 0 is a bit weird, we suggest avoiding its use. | |||
""" | |||
_params = cache_readonly(BaseOffset._params.fget) | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you remove this blank line
pandas/_libs/tslibs/offsets.pyx
Outdated
|
||
@property # NB: non-cython subclasses override with cache_readonly | ||
def _params(self): | ||
all_paras = self.__dict__.copy() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you give an expl of what this is doing (and you can move the comment inside the function too)
thanks @jbrockmendel ! |
Following #21582, the biggest avoidable overheads in Period comparisons are in a)
isinstance
calls that can be short-circuited and b)__ne__
call overhead. This PR moves__eq__
and__ne__
to the cython file, removing the__ne__
overhead, and changes anisinstance
check to a try/except.Using the same profile code from #21582, this brings the
set_index
runtime from 6.603 seconds down to 2.165 seconds.