REGR: 1.3 invalid exclusion of nuisance columns with groupby aggregation #43380
Labels: Bug, Docs, Duplicate Report, Groupby, Nuisance Columns, Reduction Operations, Regression
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.
Code Sample, a copy-pastable example
Adapted from "Automatic exclusion of nuisance columns" in the User Guide "Group by" docs:
Problem description
The User Guide "Group by" docs provide a code example that shows when nuisance columns will be excluded from aggregation. According to these docs, the case:
should produce a valid aggregation, but for pandas >= 1.3.0 it results in an empty dataframe.
The impact of the regression can even be seen in the published docs. If we look at an archive.org 2021-02-25 snapshot of the "Automatic exclusion of nuisance columns" section, we can see that the example produces correct output (see Out[170] in the code example): https://web.archive.org/web/20210225195813/https://pandas.pydata.org/docs/user_guide/groupby.html#automatic-exclusion-of-nuisance-columns
By contrast, the 2021-08-24 snapshot displays an empty dataframe for the Out[170] example: https://web.archive.org/web/20210824151314/https://pandas.pydata.org/docs/user_guide/groupby.html#automatic-exclusion-of-nuisance-columns
However, note that the docs in both cases indicate that the example should produce a correct aggregation.
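For anyone who needs output that does not depend on the automatic exclusion behaviour, one option is to select the numeric columns explicitly before aggregating. The snippet below is only a sketch of that idea: the frame `df` and its column names are hypothetical stand-ins (not the User Guide example), and the explicit-selection workaround is an assumption of this note rather than something proposed in the report.

```python
import pandas as pd

# Hypothetical frame (NOT the one from the User Guide): a grouping key "A",
# a string "nuisance" column "B", and two numeric columns "C" and "D".
df = pd.DataFrame(
    {
        "A": ["foo", "bar", "foo", "bar"],
        "B": ["one", "two", "three", "four"],
        "C": [1.0, 2.0, 3.0, 5.0],
        "D": [10.0, 20.0, 40.0, 80.0],
    }
)

# Pick the numeric columns explicitly instead of relying on pandas to
# drop the string column automatically during the reduction.
numeric_cols = df.select_dtypes(include="number").columns.tolist()
result = df.groupby("A")[numeric_cols].std()

# Printing the version alongside the result makes reports comparable
# across pandas releases.
print(pd.__version__)
print(result)
```

Because the string column never reaches the reduction, the result should contain only the `C` and `D` columns regardless of how a given pandas release treats nuisance columns.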
Expected Output
should produce the result:
For pandas 1.2.x it does so as expected. For pandas >= 1.3.0 it instead produces an incorrect, empty dataframe.