Skip to content

Allow sequence in groupby level #837

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Dec 23, 2023

Conversation

jens-diewald
Copy link
Contributor

  • Closes Level parameter of DataFrame.groupby should allow Sequences #836
  • Tests added:
    • One for series and one for DataFrame. The test case is pretty much the one from the Issue, adapted to the style of the surrounding code and I added a third row, so something actually would be grouped.
    • The template comment tells me to add assert_type() to assert the type of any return value. I do not really see value in checking any return type here, though?

This test fail currently, as the level parameter
currently does not accept any sequences.
@Dr-Irv
Copy link
Collaborator

Dr-Irv commented Dec 22, 2023

  • The template comment tells me to add assert_type() to assert the type of any return value. I do not really see value in checking any return type here, though?

We are doing this for consistency purposes. There are still tests that have to adopt that convention (they are close to 2 years old), but we are using the convention on any new tests that are added. So if you could add the check(assert_type(...)) pattern into the tests, we'd maintain that consistency.

Copy link
Collaborator

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks good. As mentioned elsewhere, convert the tests to use the check(assert_type(...)) pattern, and it should be good to go.

@jens-diewald
Copy link
Contributor Author

Is it fine like that?
Or should is split up the command and test the return type of groupby?

Copy link
Collaborator

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One small change in the tests. I just ran them - failure is due to an odd MacOS issue that is intermittent, so once you make the change, I will run again, and if it passes, approve/merge the PR.

Thanks!

@@ -1082,7 +1082,10 @@ def test_types_groupby_level() -> None:
"col4": [1, 2, 3],
}
df = pd.DataFrame(data=data).set_index(["col1", "col2", "col3"])
df.groupby(level=["col1", "col2"]).sum()
check(
assert_type(df.groupby(level=["col1", "col2"]).sum(), "pd.DataFrame"),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
assert_type(df.groupby(level=["col1", "col2"]).sum(), "pd.DataFrame"),
assert_type(df.groupby(level=["col1", "col2"]).sum(), pd.DataFrame),

You don't need the quotes here because the type is known at runtime.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense, I did the change.
Thanks for the feedback.

@@ -445,7 +445,11 @@ def test_types_groupby_level() -> None:
[(0, 0, 1), (0, 1, 2), (0, 0, 3)], names=["col1", "col2", "col3"]
)
s = pd.Series([1, 2, 3], index=index)
s.groupby(level=["col1", "col2"]).sum()
check(
assert_type(s.groupby(level=["col1", "col2"]).sum(), "pd.Series[int]"),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You do need the quotes here around pd.Series[int], because the generic version is not known at runtime.

Copy link
Collaborator

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @jens-diewald

@Dr-Irv Dr-Irv merged commit 068679c into pandas-dev:main Dec 23, 2023
twoertwein pushed a commit to twoertwein/pandas-stubs that referenced this pull request Dec 24, 2023
* Add test cases for level sequences

This test fail currently, as the level parameter
currently does not accept any sequences.

* Allow sequences for groupby level parameter

This fixes pandas-dev#836

* Add assert_type

* Remove unnecessary quotes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Level parameter of DataFrame.groupby should allow Sequences
2 participants