PERF: get_block_type #52109
LGTM
pandas/core/internals/blocks.py
Outdated
# than is_foo_dtype
kind = dtype.kind
if kind in ["M", "m"]:
    return DatetimeLikeBlock
elif kind in ["f", "c", "i", "u", "b"]:
We can improve this a little by checking kind in "fciub" instead of checking membership in the list.
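To make the suggestion concrete, here is a minimal sketch comparing the two membership checks. The constant and function names are hypothetical and exist only for illustration; they are not part of pandas. It assumes `kind` is always a single-character dtype kind code, since substring matching on a string would also accept `""` or `"fc"`, which the list form rejects.

```python
import timeit

# Hypothetical names for illustration; not taken from pandas.
NUMERIC_KINDS_LIST = ["f", "c", "i", "u", "b"]
NUMERIC_KINDS_STR = "fciub"


def is_numeric_kind_list(kind: str) -> bool:
    # Membership test against a list: compares element by element.
    return kind in NUMERIC_KINDS_LIST


def is_numeric_kind_str(kind: str) -> bool:
    # Membership test against a string: a substring search.
    # Equivalent to the list check only for single-character inputs.
    return kind in NUMERIC_KINDS_STR


def compare_timings(number: int = 1_000_000) -> dict[str, float]:
    # Time both checks on the worst case for the list ("b" is last).
    return {
        "list": timeit.timeit(lambda: is_numeric_kind_list("b"), number=number),
        "str": timeit.timeit(lambda: is_numeric_kind_str("b"), number=number),
    }


if __name__ == "__main__":
    for name, seconds in compare_timings().items():
        print(f"{name}: {seconds:.3f} s")
```

Actual timings depend on the interpreter and machine, so the difference is best measured locally rather than assumed.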
updated
pandas/core/internals/blocks.py
Outdated
kind = dtype.kind

cls: type[Block]

if isinstance(dtype, SparseDtype):
I think the SparseDtype check may no longer be needed.
removed
Your suggested updates give a bit of an improvement for non-EAs as well:
import numpy as np
from pandas.core.internals.blocks import get_block_type
%timeit get_block_type(np.dtype('float64'))
# 724 ns ± 59.4 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each) -> main
# 590 ns ± 30 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each) -> PR
ping on green
green - thanks
thanks @lukemanley
cc @jbrockmendel - this may partly close #48212; however, I suspect the OP was referring to non-EAs, given the old version of pandas.
The performance improvement is mostly for EAs, where the .kind call can be a bottleneck.
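As a rough illustration of why reading `.kind` once matters, the sketch below uses a hypothetical `CountingDtype` class whose `kind` is a property, standing in for an extension dtype. The dispatcher `pick_block_label` and the labels it returns are assumptions made for this example and are not the pandas implementation; the point is only that the property is evaluated a single time and the local variable is reused for every branch.

```python
class CountingDtype:
    """Hypothetical stand-in for a dtype whose ``kind`` is a property."""

    def __init__(self, kind: str) -> None:
        self._kind = kind
        self.kind_calls = 0  # counts how often the property is evaluated

    @property
    def kind(self) -> str:
        self.kind_calls += 1
        return self._kind


def pick_block_label(dtype) -> str:
    # Read the (possibly expensive) property once and reuse the local.
    kind = dtype.kind
    # Assumes single-character kind codes, as with NumPy dtypes.
    if kind in "Mm":
        return "datetimelike"
    elif kind in "fciub":
        return "numeric"
    return "other"


if __name__ == "__main__":
    dtype = CountingDtype("f")
    print(pick_block_label(dtype), dtype.kind_calls)
```

If each branch instead accessed `dtype.kind` directly, a dtype falling through to the last branch would pay the property cost once per comparison.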