BUG: DataFrame with Int64 columns casts to float64 with .max()/.min() #34210
Changes from 8 commits
9728d81
7765bac
c58e85d
af366b8
4a14032
e4e6c8e
9b5f41c
e704f51
60ab03a
7b18d89
e850108
@@ -8419,7 +8419,9 @@ def _get_data(axis_matters):
                 raise NotImplementedError(msg)
             return data
 
-        if numeric_only is not None and axis in [0, 1]:
+        is_numeric = all(b.is_numeric for b in self._mgr.blocks)
+
+        if (is_numeric or numeric_only is not None) and axis is not None:
             df = self
             if numeric_only is True:
                 df = _get_data(axis_matters=True)
@@ -8441,6 +8443,10 @@ def blk_func(values):
             assert isinstance(res, dict)
             if len(res):
                 assert len(res) == max(list(res.keys())) + 1, res.keys()
+            elif not out_dtype:
+                # The default dtype for empty Series will be 'object' instead of
+                # 'float64' in a future version.
Do we have tests which currently hit this case?
This was added because of test failures in tests/frame/test_analytics.py (maybe others).
So I guess on master, no tests hit this case.
These cases were for empty DataFrames (after removing timestamp columns). Using all() for is_numeric returned True for an empty iterable (_mgr.blocks). After changing is_numeric, this elif not out_dtype block doesn't need to be added.
+                out_dtype = "float64"
             out = df._constructor_sliced(res, index=range(len(res)), dtype=out_dtype)
             out.index = df.columns
             if axis == 0 and is_object_dtype(out.dtype):
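The empty-DataFrame case raised in the thread on the elif branch comes down to how Python's built-in all() treats an empty iterable: it returns True. The sketch below illustrates this in isolation. The Block class and both helper functions are hypothetical stand-ins, not pandas internals, and the guarded variant is only one possible way to avoid the vacuous result, not necessarily what the PR ended up doing.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical stand-in for an internal block; only the flag matters."""
    is_numeric: bool


def all_numeric_naive(blocks):
    # Same shape as the expression in the diff: vacuously True for no blocks.
    return all(b.is_numeric for b in blocks)


def all_numeric_guarded(blocks):
    # Assumed guard for illustration: require at least one block.
    return len(blocks) > 0 and all(b.is_numeric for b in blocks)


empty = []
mixed = [Block(True), Block(False)]
numeric = [Block(True), Block(True)]

print(all_numeric_naive(empty))      # True (vacuous truth)
print(all_numeric_guarded(empty))    # False
print(all_numeric_naive(mixed))      # False
print(all_numeric_guarded(numeric))  # True
```

With the naive check, an empty block list would take the numeric code path, which is the situation the elif not out_dtype branch was added to paper over.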
Do the added tests hit this sufficiently (e.g. some blocks numeric, some not)?
no, but that is another issue #34520
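A test for the mixed-block case asked about above might look roughly like the following. This is a sketch under assumptions: the column names and values are made up, it is not taken from the PR's test suite, and it checks only the reduced values because the resulting dtype differs between pandas versions.

```python
import pandas as pd

# Hypothetical frame mixing a nullable-integer block, an object block
# and a float block.
df = pd.DataFrame(
    {
        "a": pd.array([1, 2, None], dtype="Int64"),
        "b": ["x", "y", "z"],
        "c": [0.5, 1.5, 2.5],
    }
)

# numeric_only=True drops the object column before reducing.
result = df.max(numeric_only=True)

# Value checks only; the dtype of `result` is deliberately not asserted.
assert list(result.index) == ["a", "c"]
assert int(result["a"]) == 2
assert float(result["c"]) == 2.5
```

A fuller test would also pin down the expected dtype for each column combination once the behavior tracked in #34520 is settled.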