Skip to content

DEPR: scalar index for length-1-list level groupby #51817

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

august-tengland
Copy link
Contributor

@august-tengland august-tengland commented Mar 7, 2023

Previously, iterating over a Groupby object that had been initialized with a length-1-list level parameter (ex. level = [0]) would return scalar indexes as opposed to tuples with only the first element defined (ex. 0 as opposed to (0, ) ). This behavior differs from when using the by parameter and so the change is motivated.

@mroeschke mroeschke requested a review from rhshadrach March 7, 2023 23:55
@mroeschke mroeschke added Groupby Deprecate Functionality to remove in pandas labels Mar 7, 2023
Copy link
Member

@rhshadrach rhshadrach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! This PR doesn't deprecate the behavior, it changes the behavior.

We need to first warn users that the behavior will change without changing the current behavior.

@august-tengland
Copy link
Contributor Author

Thanks for the PR! This PR doesn't deprecate the behavior, it changes the behavior.

We need to first warn users that the behavior will change without changing the current behavior.

Well that's very reasonable, should have thought of that... What's the best way to go about this? Maybe just make a new PR (with an actual "depreciation") and then leave/close this branch so that it can be used for reference when the time is right?

@rhshadrach
Copy link
Member

I would suggest just modifying this PR to deprecate the behavior; as you've already demonstrated the changes to enforce that deprecation won't be extensive.

Copy link
Member

@rhshadrach rhshadrach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally looks good! Some questions / requests.

@august-tengland
Copy link
Contributor Author

I'm not sure how to resolve this failing mypy check, using is_list_like(level) together with len(level) does not cause any trouble in grouping.py. Anyone got a clue?

@rhshadrach
Copy link
Member

Looks okay to add a type ignore here. We should eventually use a type guard with is_list_like but might now be able to do that just yet.

Copy link
Member

@rhshadrach rhshadrach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

Comment on lines +2712 to +2714
@pytest.mark.parametrize(
"level_arg, multiindex", [([0], False), ((0,), False), ([0], True), ((0,), True)]
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a small enough case that it is perfectly fine as-is, but wanted to mention that you can stack parameterizations and it will take the cartesian product:

@pytest.mark.parametrize("level_arg", [[0], (0,)])
@pytest.mark.parametrize("multiindex", [False, True])

@rhshadrach rhshadrach added this to the 2.1 milestone Mar 18, 2023
@rhshadrach rhshadrach added the Bug label Mar 18, 2023
@rhshadrach rhshadrach merged commit d8d1a47 into pandas-dev:main Mar 18, 2023
@rhshadrach
Copy link
Member

Thanks @august-tengland!

@august-tengland august-tengland deleted the single-element-level-argument-iteration branch March 18, 2023 20:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Deprecate Functionality to remove in pandas Groupby MultiIndex
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: Inconsistency in groupby with multi-index
3 participants