BUG: DataFrame reductions with object dtype and axis=1 #50224

Closed

rhshadrach
Member

In #49616 I attempted to only remove the errant cast to float, but this led to the issue that, e.g., for axis=1 the sum of boolean columns becomes object. Unconditionally taking the block manager reduce path solves this problem, but then we run into the issue of taking a transpose of a frame with no columns (see the comments in the diff here). This PR resolves that issue, but is a bit of a hack (I can't seem to find any other resolution).
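For readers who want to see the two situations described above, here is a minimal sketch. It is not taken from the PR: the frames and variable names are made up for illustration, and the printed dtype of the row-wise sum may differ between pandas versions.

```python
import pandas as pd

# Two boolean columns; a row-wise sum counts the True values in each row.
bools = pd.DataFrame({"a": [True, False], "b": [True, True]})
row_sums = bools.sum(axis=1)
print(row_sums.tolist(), row_sums.dtype)

# A frame with rows but no columns: transposing it yields a frame with
# columns but no rows, so the transposed frame holds no data to reduce.
no_columns = pd.DataFrame(index=[0, 1, 2])
transposed = no_columns.T
print(no_columns.shape, transposed.shape)
```

The second half only shows the shape of the edge case; it does not reproduce the workaround used in the diff.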

In the test here, there are some reductions (e.g. mean) for which one might have expected float back instead; I've opened #49618 for this.
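A quick way to check which dtype each reduction returns on an installed pandas version is sketched below. The frame and the `result_dtypes` name are illustrative assumptions rather than code from the test suite, and the dtypes it prints are version-dependent, so none are asserted here.

```python
import pandas as pd

# Integer values stored with object dtype, reduced row-wise (axis=1).
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, dtype=object)

result_dtypes = {}
for name in ("sum", "mean", "max", "count"):
    result = getattr(df, name)(axis=1)
    result_dtypes[name] = str(result.dtype)

# Shows, per reduction, whether the result came back as object or numeric.
print(result_dtypes)
```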

@rhshadrach rhshadrach added Dtype Conversions Unexpected or buggy dtype conversions DataFrame DataFrame data structure Reduction Operations sum, mean, min, max, etc. labels Dec 13, 2022
@rhshadrach rhshadrach marked this pull request as draft December 14, 2022 17:16
Comment on lines 727 to 745
expected_dtype = {
    "any": "bool",
    "all": "bool",
    "count": "int64",
    "sum": "float",
    "prod": "float",
    "skew": "float",
    "kurt": "float",
    "sem": "float",
}.get(all_reductions, "object")
if using_array_manager and all_reductions in (
    "max",
    "min",
    "mean",
    "std",
    "var",
    "median",
):
    expected_dtype = "float"
Member Author

Compared to main, this PR does not change behavior for array manager. For block manager, the reducers on L738-743 (max, min, mean, std, var, median) return float type on main but object type here.
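The lookup in the snippet under review can be restated as a small standalone function, which makes the block manager versus array manager difference easy to check. This is only a sketch: the function name `expected_dtype_for` and the constant names are hypothetical, and the mapping is copied from the test snippet rather than from pandas itself.

```python
# Reductions whose expected dtype is the same under either manager.
FIXED_DTYPES = {
    "any": "bool",
    "all": "bool",
    "count": "int64",
    "sum": "float",
    "prod": "float",
    "skew": "float",
    "kurt": "float",
    "sem": "float",
}

# Reductions expected to return float only under the array manager.
FLOAT_UNDER_ARRAY_MANAGER = ("max", "min", "mean", "std", "var", "median")


def expected_dtype_for(reduction: str, using_array_manager: bool) -> str:
    """Return the dtype the test expects for a given reduction name."""
    dtype = FIXED_DTYPES.get(reduction, "object")
    if using_array_manager and reduction in FLOAT_UNDER_ARRAY_MANAGER:
        dtype = "float"
    return dtype


print(expected_dtype_for("mean", using_array_manager=False))
print(expected_dtype_for("mean", using_array_manager=True))
```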

@github-actions
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update and respond to this comment if you're still interested in working on this.

@rhshadrach
Member Author

Closing in favor of #51335

@rhshadrach rhshadrach closed this Feb 11, 2023
@rhshadrach rhshadrach deleted the object_reduction_axis_1_attempt_2 branch March 2, 2023 01:02
Labels
DataFrame DataFrame data structure Dtype Conversions Unexpected or buggy dtype conversions Reduction Operations sum, mean, min, max, etc. Stale
Development

Successfully merging this pull request may close these issues.

BUG: DataFrame reductions with object dtype and axis=1