BUG: Fix bug in SeriesGroupBy.value_counts when DataFrame has one row (#42618) #42640

Merged
merged 3 commits into from
Jul 24, 2021

Conversation

@neelmraman neelmraman commented Jul 20, 2021

The bug was introduced in v1.3.0 here. I couldn't find a reason why len(lchanges) should be used, and the unit tests still pass with len(val) instead.
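
For context, a minimal sketch of the one-row case behind #42618. The frame and column names here are illustrative (they mirror the test added in this PR) and are not taken from the original report. Per the linked issue, the call raises IndexError on 1.3.0; with the fix applied it should return a single count.

```python
import pandas as pd

# One-row frame: grouping by "A" yields exactly one group with one value.
df = pd.DataFrame([[1, 2]], columns=["A", "B"])

# Reported in #42618 to raise IndexError on pandas 1.3.0.
result = df.groupby("A")["B"].value_counts()

# With the fix: a single entry, indexed by the (A, B) pair.
print(result)
```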

@rhshadrach rhshadrach added the Regression (Functionality that used to work in a prior pandas version) label Jul 23, 2021

@rhshadrach rhshadrach left a comment

Thanks for the PR! Code change lgtm, some minor requests. As you identified (thanks for that!), this is a regression so should be fixed in 1.3.1 if we can get it in (otherwise, will be 1.3.2).

@@ -258,7 +258,7 @@ Groupby/resample/rolling
^^^^^^^^^^^^^^^^^^^^^^^^
- Fixed bug in :meth:`SeriesGroupBy.apply` where passing an unrecognized string argument failed to raise ``TypeError`` when the underlying ``Series`` is empty (:issue:`42021`)
- Bug in :meth:`Series.rolling.apply`, :meth:`DataFrame.rolling.apply`, :meth:`Series.expanding.apply` and :meth:`DataFrame.expanding.apply` with ``engine="numba"`` where ``*args`` were being cached with the user passed function (:issue:`42287`)
-
- Fixed bug in :meth:`SeriesGroupBy.value_counts` when the DataFrame/Series you are grouping has one row (:issue:`42618`)

Only Series, I think. Even if you start with a DataFrame/DataFrameGroupBy object and then subset, this method is only acting on a Series.
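
A small sketch of the point above, assuming an illustrative three-column frame: selecting a single column from a DataFrameGroupBy hands back a SeriesGroupBy, so value_counts is always operating on a Series at that point.

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3]], columns=["A", "B", "C"])

grouped = df.groupby("A")   # grouping a DataFrame
selected = grouped["B"]     # selecting one column from the grouped object

# Print the class names to see which groupby type each object is.
print(type(grouped).__name__)
print(type(selected).__name__)
```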

Also, move this entry to the 1.3.1 whatsnew.

result = dfg["B"].value_counts()
expected = df.value_counts()

tm.assert_series_equal(result, expected, check_names=False)

Can you rename the expected value instead of skipping the name check?


tm.assert_series_equal(result, expected, check_names=False)

df = DataFrame([[1, 2, 3]], columns=["A", "B", "C"])

Parametrize the test instead, e.g.

pytest.mark.parametrize("columns", [["A", "B"], ["A", "B", "C"]])

then use data=range(len(columns)) and groupby(columns[:-1])
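
One way the parametrized test could look, as a sketch rather than the test that was merged. The test name is a placeholder, and the assertions compare only values and index entries, on the assumption that the name of the resulting Series may differ between pandas versions.

```python
import pandas as pd
import pytest


@pytest.mark.parametrize("columns", [["A", "B"], ["A", "B", "C"]])
def test_series_groupby_value_counts_one_row(columns):
    # Single row holding 0..n-1, one value per column.
    row = list(range(len(columns)))
    df = pd.DataFrame(data=[row], columns=columns)

    # Group by every column except the last, then count the last one.
    result = df.groupby(columns[:-1])[columns[-1]].value_counts()

    # Exactly one combination of values, seen once.
    assert result.tolist() == [1]
    assert list(result.index) == [tuple(row)]
```

This covers both the single-key and multi-key groupby cases with one test body.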

@rhshadrach rhshadrach left a comment

Clicked the wrong button... :)

@simonjayhawkins simonjayhawkins added this to the 1.3.1 milestone Jul 23, 2021
@neelmraman neelmraman force-pushed the groupby_value_counts_42618 branch from 8206ca7 to 0a10eec Compare July 24, 2021 02:04

@rhshadrach rhshadrach left a comment

lgtm

Labels
Bug, Groupby, Regression (Functionality that used to work in a prior pandas version)
Development

Successfully merging this pull request may close these issues.

BUG: SeriesGroupBy.value_counts() throws IndexError if there is only one group
3 participants