PERF: reindex default fill_value dtype #47281
@jbrockmendel - any suggestions here? The test added in this PR is failing for certain builds. Unfortunately, I'm unable to reproduce the failures locally so I'm not sure where to start.
@lukemanley do you happen to be using a Windows machine or a 32-bit build locally?
pandas/core/internals/managers.py
dtype = interleaved_dtype([blk.dtype for blk in self.blocks])
if is_float_dtype(dtype):
    # GH45857 avoid unnecessary upcasting
    dtype = cast(np.dtype, dtype)
is there a risk of getting e.g. Float64Dtype here?
I don't think there is a risk in a DataFrame of only float32 values.
Right, I'm talking about a case where the DataFrame isn't all-float32 but includes at least one Float64Dtype.
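The concern in this thread can be illustrated with a small sketch. `is_float_dtype` also returns `True` for the masked extension dtype `Float64Dtype`, which is not an `np.dtype`, so a `cast(np.dtype, dtype)` after that check would only silence the type checker. The helper `is_numpy_float` below is hypothetical, not part of pandas, and shows one possible stricter guard.

```python
import numpy as np
import pandas as pd
from pandas.api.types import is_float_dtype


def is_numpy_float(dtype) -> bool:
    # Hypothetical helper (not a pandas API): accept only plain NumPy
    # floating dtypes and reject extension dtypes such as Float64Dtype.
    return isinstance(dtype, np.dtype) and dtype.kind == "f"


masked = pd.Float64Dtype()   # nullable extension dtype
plain = np.dtype("float32")  # plain NumPy dtype

# is_float_dtype accepts both kinds of dtype ...
masked_passes_check = is_float_dtype(masked)
plain_passes_check = is_float_dtype(plain)

# ... but only one of them is actually an np.dtype instance.
masked_is_numpy = isinstance(masked, np.dtype)
plain_is_numpy = isinstance(plain, np.dtype)
```

Whether `interleaved_dtype` can actually return an extension dtype at this point in `managers.py` is the open question raised in the review; the sketch only shows why the check alone does not rule it out.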
There's a comment in NDFrame._reindex_with_indexers
Nope. I'm unable to reproduce those errors on mac or 64bit windows.
I'll take a look, thanks.
I’m going to close this PR. I cannot reproduce the failures locally. I tried a few different things but was unsuccessful in identifying where the float64 dtype comes from. I do not see that behavior locally.
Default fill_value (np.nan) to match interleaved_dtype when possible to avoid upcasting and allow for block consolidation.
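As a rough illustration of the idea (a NumPy-only sketch, not the code in this PR): a NaN that matches the common float dtype keeps newly filled columns at that dtype, while a plain `np.nan` produces float64 and forces an upcast when the columns are combined. The helper `default_fill_value` is a hypothetical name introduced for this example.

```python
import numpy as np


def default_fill_value(dtypes):
    # Hypothetical helper: return a NaN typed to the common float dtype,
    # falling back to the plain Python-float np.nan otherwise.
    if not dtypes:
        return np.nan
    common = np.result_type(*dtypes)
    if np.issubdtype(common, np.floating):
        return common.type(np.nan)
    return np.nan


existing = np.ones((3, 2), dtype=np.float32)

# Dtype-matched NaN: the new column stays float32, so it can be stored
# alongside the existing columns in one homogeneous 2D array.
fill = default_fill_value([existing.dtype])
new_col = np.full((3, 1), fill)
combined = np.hstack([existing, new_col])

# Plain np.nan: the new column is float64 and the result is upcast.
upcast_col = np.full((3, 1), np.nan)
upcast = np.hstack([existing, upcast_col])

print(combined.dtype, upcast.dtype)
```

In pandas terms, same-dtype columns can live in one consolidated block, which is the block consolidation benefit mentioned in the description.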
Using the example from the OP:
main:
PR: