-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: Fix bug in SeriesGroupBy.value_counts when DataFrame has one row (#42618) #42640
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: Fix bug in SeriesGroupBy.value_counts when DataFrame has one row (#42618) #42640
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! Code change lgtm, some minor requests. As you identified (thanks for that!), this is a regression so should be fixed in 1.3.1 if we can get it in (otherwise, will be 1.3.2).
doc/source/whatsnew/v1.4.0.rst
Outdated
@@ -258,7 +258,7 @@ Groupby/resample/rolling | |||
^^^^^^^^^^^^^^^^^^^^^^^^ | |||
- Fixed bug in :meth:`SeriesGroupBy.apply` where passing an unrecognized string argument failed to raise ``TypeError`` when the underlying ``Series`` is empty (:issue:`42021`) | |||
- Bug in :meth:`Series.rolling.apply`, :meth:`DataFrame.rolling.apply`, :meth:`Series.expanding.apply` and :meth:`DataFrame.expanding.apply` with ``engine="numba"`` where ``*args`` were being cached with the user passed function (:issue:`42287`) | |||
- | |||
- Fixed bug in :meth:`SeriesGroupBy.value_counts` when the DataFrame/Series you are grouping has one row (:issue:`42618`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only Series, I think. Even if you start with a DataFrame/DataFrameGroupBy object and then subset, this method is only acting on a Series.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, move to 1.3.1
result = dfg["B"].value_counts() | ||
expected = df.value_counts() | ||
|
||
tm.assert_series_equal(result, expected, check_names=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you rename to the expected value instead of not checking
|
||
tm.assert_series_equal(result, expected, check_names=False) | ||
|
||
df = DataFrame([[1, 2, 3]], columns=["A", "B", "C"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Parametrize the test instead, e.g.
pytest.mark.parametrize("columns", [["A", "B"], ["A", "B", "C"]])
then use data=range(len(columns))
, and groupby(columns)[:-1]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clicked the wrong button... :)
8206ca7
to
0a10eec
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
…nts when DataFrame has one row (pandas-dev#42618)
…ataFrame has one row (#42618) (#42696) Co-authored-by: neelmraman <[email protected]>
Was introduced in v1.3.0 here. I couldn't find a reason why
len(lchanges)
should be used and the unit tests still passed withlen(val)
instead.