-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
PERF: Using _Period for optimization #32973
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pandas/_libs/index.pyx
Outdated
@@ -468,9 +467,8 @@ cdef class PeriodEngine(Int64Engine): | |||
cdef int64_t _unbox_scalar(self, scalar) except? -1: | |||
if scalar is NaT: | |||
return scalar.value | |||
if isinstance(scalar, Period): | |||
if isinstance(scalar, periodlib._Period): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you would need to cimport _Period to get the performance improvement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just so I don't do anything stupid, I need to create the file period.pxd
at pandas/_libs/tslibs/
, correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep
ASV benchmarks:
|
ok so not much improvements. @jbrockmendel |
@MomIsBestFriend since the improvements are very small, two questions to ask:
|
Yes the
@jbrockmendel What do you mean by "build size"? |
Are the "ratio" numbers consistent? I'm looking to be able to make a concrete statement to the effect of "this makes a small but non-zero improvement in X"
Roughly i mean the size in bytes of the wheel we end up distributing. Could alternatively measure the memory footprint of the imported modules, but thats trickier. |
Not strongly against this, but a slight preference to just remove the TODO; this doesn't seem to have a material impact so I don't think worth any added complexity |
Yes the numbers are consistent, unfortunately you can't say that there is an improvement.
There is a size difference which is only a 4 bytes (PR is smaller) |
Agreed |
agree here, @MomIsBestFriend if you can just remove the TODO and drop the other changes can accept this PR. |
thanks @MomIsBestFriend |
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff
I did two bench marks one in a tight loop(
%timeit
), and one with asv.I don't see a significant performance change (but I'm not sure I read the output correctly), maybe worth just removing the "TODO" note.
Benchmarks:
Timeit:
ASV (period.Indexing)