-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: DataFrame.groupby() interprets tuple as list of keys #17996
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## master #17996 +/- ##
==========================================
- Coverage 91.25% 91.23% -0.02%
==========================================
Files 163 163
Lines 50115 50116 +1
==========================================
- Hits 45730 45722 -8
- Misses 4385 4394 +9
Continue to review full report at Codecov.
|
pandas/core/groupby.py
Outdated
@@ -2769,7 +2769,10 @@ def _get_grouper(obj, key=None, axis=0, level=None, sort=True, | |||
keys = [key] | |||
match_axis_length = False | |||
else: | |||
keys = key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make this part an elif
add a comment
pandas/core/groupby.py
Outdated
if isinstance(key, tuple) and isinstance(obj._info_axis, MultiIndex): | ||
keys = [key] | ||
else: | ||
keys = key | ||
match_axis_length = len(keys) == len(group_axis) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not right the match_len axis with a tuplr
add a whatsnew for 0.21.1. if you want to try to clean up all of that if logic would be a big + |
|
doc/source/whatsnew/v0.21.0.txt
Outdated
@@ -1052,6 +1052,7 @@ Groupby/Resample/Rolling | |||
- Bug in ``DataFrame.groupby`` where a single level selection from a ``MultiIndex`` unexpectedly sorts (:issue:`17537`) | |||
- Bug in ``DataFrame.groupby`` where spurious warning is raised when ``Grouper`` object is used to override ambiguous column name (:issue:`17383`) | |||
- Bug in ``TimeGrouper`` differs when passes as a list and as a scalar (:issue:`17530`) | |||
- Bug in ``DataFrame.groupby`` where key as tuple in a ``MultiIndex`` were interpreted as a list of keys (:issue:`17979`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to 0.22.0
pandas/core/groupby.py
Outdated
@@ -2765,7 +2765,9 @@ def _get_grouper(obj, key=None, axis=0, level=None, sort=True, | |||
elif isinstance(key, BaseGrouper): | |||
return key, [], obj | |||
|
|||
if not isinstance(key, (tuple, list)): | |||
# when MultiIndex, allow tuple to be a key | |||
if not isinstance(key, (tuple, list)) or isinstance(key, tuple) and \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use parens to avoid ambiguity when reading this condition
add an
is_axis_multiindex = isinstance(obj._info_axis, MultiIndex)
higher up and use it here (and elsewhere in this functionif you can)
lgtm, ping on green. |
8376ec9
to
afb0031
Compare
@jreback Rebased and green 👍 |
thanks @GuessWhoSamFoo keep em coming! |
closes pandas-dev#17979 Author: sfoo <[email protected]> Author: Jeff Reback <[email protected]> Closes pandas-dev#17996 from GuessWhoSamFoo/groupby_tuples and squashes the following commits: afb0031 [Jeff Reback] TST: separate out grouping-type tests c52b2a8 [sfoo] Moved notes to 0.22; created is_axis_multiindex var - pending internal use fb52c1c [sfoo] Added whatsnew; checked match_axis_length 99ebc4e [sfoo] Cast groupby tuple as list when multiindex
xref #17996 Author: Pietro Battiston <[email protected]> Closes #18249 from toobaz/groupby_tuples and squashes the following commits: dafc838 [Pietro Battiston] DOC: Clarification of groupby(by=) argument e0bdfa7 [Pietro Battiston] TST: Test for tuples in columns, fixes to previous tests 74f91e0 [Pietro Battiston] TST: Fix tests which used tuples to pass multiple keys 201a4fe [Pietro Battiston] BUG: Never interpret a tuple as a list of keys
closes pandas-dev#17979 Author: sfoo <[email protected]> Author: Jeff Reback <[email protected]> Closes pandas-dev#17996 from GuessWhoSamFoo/groupby_tuples and squashes the following commits: afb0031 [Jeff Reback] TST: separate out grouping-type tests c52b2a8 [sfoo] Moved notes to 0.22; created is_axis_multiindex var - pending internal use fb52c1c [sfoo] Added whatsnew; checked match_axis_length 99ebc4e [sfoo] Cast groupby tuple as list when multiindex
xref pandas-dev#17996 Author: Pietro Battiston <[email protected]> Closes pandas-dev#18249 from toobaz/groupby_tuples and squashes the following commits: dafc838 [Pietro Battiston] DOC: Clarification of groupby(by=) argument e0bdfa7 [Pietro Battiston] TST: Test for tuples in columns, fixes to previous tests 74f91e0 [Pietro Battiston] TST: Fix tests which used tuples to pass multiple keys 201a4fe [Pietro Battiston] BUG: Never interpret a tuple as a list of keys
git diff upstream/master -u -- "*.py" | flake8 --diff