PERF: get_block_type heavy use could benefit from performance improvements #48212
Labels
Closing Candidate
May be closeable, needs more eyeballs
Internals
Related to non-user accessible pandas implementation
Performance
Memory or execution speed performance
Pandas version checks
I have checked that this issue has not already been reported.
I have confirmed this issue exists on the latest version of pandas.
I have confirmed this issue exists on the main branch of pandas.
Reproducible Example
.
Installed Versions
Prior Performance
While upgrading a large, pandas-heavy codebase from 0.19.2, I looked for areas to improve performance, since there has been a fairly consistent ~25% performance drop across various slices of the codebase when moving to 1.1.5 (other libraries were upgraded at the same time, so pandas is not necessarily the only contributor). Making blocks showed up in profiling as taking quite a lot longer, and get_block_type is an easy place to boost performance because it is so heavily used. It looks like this is the case on master as well.
Caching the results of the method produced a ~5% performance increase in quite a large test suite, but it would be nice to see the change on asv.
Generally, it looks like the is_*dtype(...) related places might benefit from attention, so I will look into those if a usable pattern comes out of this. IIRC it was an extension dtype added in 2016, probably just after 0.19.2, that looked like it took a fair bit of time.
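As a rough illustration of the caching idea, here is a minimal sketch and not the actual change that was tested. It assumes the expensive part is a chain of is_*_dtype checks whose answer depends only on the dtype. The function classify_dtype and its string labels are hypothetical stand-ins for get_block_type and the block classes it returns. Only public helpers from pandas.api.types are used.

```python
from functools import lru_cache

import numpy as np
from pandas.api.types import (
    is_bool_dtype,
    is_datetime64_any_dtype,
    is_extension_array_dtype,
    is_numeric_dtype,
    is_timedelta64_dtype,
)


def classify_dtype(dtype):
    """Hypothetical dtype dispatch: walk a chain of is_*_dtype checks.

    The labels are illustrative only; they are not pandas block classes.
    """
    if is_extension_array_dtype(dtype):
        return "extension"
    if is_datetime64_any_dtype(dtype):
        return "datetime"
    if is_timedelta64_dtype(dtype):
        return "timedelta"
    # bool is tested before numeric because is_numeric_dtype also accepts bool.
    if is_bool_dtype(dtype):
        return "bool"
    if is_numeric_dtype(dtype):
        return "numeric"
    return "object"


# dtype objects are hashable, so the result can be memoized per distinct dtype.
# Repeated calls with the same dtype then skip the whole chain of checks.
classify_dtype_cached = lru_cache(maxsize=None)(classify_dtype)
```

The cache is keyed on the dtype rather than on the values, so it only helps if the result never depends on the array contents. An unbounded cache also keeps a reference to every dtype it has seen, which may or may not be acceptable for parametrized extension dtypes.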