TST: groupby apply called multiple times #34897

Merged: 7 commits merged on Jun 20, 2020
21 changes: 21 additions & 0 deletions pandas/tests/groupby/test_apply.py
@@ -974,3 +974,24 @@ def test_apply_function_with_indexing_return_column():
    result = df.groupby("foo1", as_index=False).apply(lambda x: x.mean())
    expected = DataFrame({"foo1": ["one", "three", "two"], "foo2": [3.0, 4.0, 4.0]})
    tm.assert_frame_equal(result, expected)


def test_apply_function_called_count(capsys):
Contributor

move this near test_group_apply_once_per_group and indicate the same issue number (as well)

Contributor

and rename this test to test_group_apply_once_per_group2

Contributor Author

Alright @jreback. Will change it accordingly.

Contributor Author

Should remove GH: 31111 and use # GH2936, GH7739, GH10519, GH2656, GH12155, GH20084, GH21417 @jreback

Contributor

you can add this issue number as well, that is ok

Contributor Author

Alright. Fixed accordingly.

    # GH: 31111
    # groupby-apply needs to call the function once per unique group key,
    # i.e. len(set(group_by_column)) times

    expected = 2  # Number of times `apply` should call a function for the current test

    df = pd.DataFrame(
        {
            "group_by_column": [0, 0, 0, 0, 1, 1, 1, 1],
            "test_column": ["0", "2", "4", "6", "8", "10", "12", "14"],
        },
        index=["0", "2", "4", "6", "8", "10", "12", "14"],
    )

    df.groupby("group_by_column").apply(lambda df: print("function_called"))

    result = capsys.readouterr().out.count("function_called")
    # If `groupby` behaves unexpectedly, this test will break
    assert result == expected
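
For readers who want to check the same property outside pytest, here is a minimal sketch that counts calls with a plain list instead of capturing stdout through `capsys`. The names `calls`, `record_group`, `sizes` and `n_groups` are hypothetical and not part of this PR. The explicit `[["test_column"]]` selection is an assumption made here so the callback never sees the grouping column; the test above applies the function to the whole frame.

```python
import pandas as pd

# Same grouping layout as the test in this PR: two distinct keys, four rows each.
df = pd.DataFrame(
    {
        "group_by_column": [0, 0, 0, 0, 1, 1, 1, 1],
        "test_column": ["0", "2", "4", "6", "8", "10", "12", "14"],
    }
)

calls = []  # hypothetical recorder: one entry per invocation of the callback


def record_group(group):
    # Record which rows were handed to this call, then return a scalar.
    calls.append(list(group.index))
    return len(group)


# Assumption: select the value column explicitly so only it reaches the callback.
sizes = df.groupby("group_by_column")[["test_column"]].apply(record_group)

n_groups = df["group_by_column"].nunique()
print(f"callback invoked {len(calls)} time(s) for {n_groups} group(s)")
```

On a pandas version where `apply` evaluates each group exactly once, `len(calls)` should equal `n_groups`. Recording the row labels rather than a bare counter also makes it visible which group was evaluated more than once if the counts ever differ.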