BUG: Same function calls on the same DataFrameGroupBy object give different results #34271
Comments
I came across the same issue as yours.
Confirmed on Ubuntu 18.04, Python 3.8.2, Pandas 1.0.3.
The issue lies in the function "first()". It works in place and modifies the GroupBy object, removing the "group" column. The same happens with the function "nth". I can take this if it is a bug and not by design.
Along with "first()", this seems to be an issue with the following functions as well: pandas/pandas/core/groupby/groupby.py Lines 1511 to 1584 in a087f3e
The error occurs in the reset_cache call at pandas/pandas/core/groupby/groupby.py Line 675 in a087f3e,
which is a call to Line 67 in a087f3e.
This seems to be too advanced for me to work on.
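The comments above attribute the behaviour to first() changing state on the grouped object and resetting a cache. A toy model of that kind of side effect, in plain Python, might look like the sketch below. The class, attribute, and method names are hypothetical and are not pandas' actual internals; the sketch only shows how a method that narrows a selection, clears a cached value, and never restores the selection can change what later calls see.

```python
from functools import cached_property


class ToyGroupBy:
    """Toy stand-in for a grouped object; NOT pandas' actual implementation."""

    def __init__(self, columns, key):
        self.columns = list(columns)
        self.key = key
        self._exclude_key = False  # hypothetical selection flag

    @cached_property
    def selected_columns(self):
        # Computed once, then cached until the cache entry is dropped.
        if self._exclude_key:
            return [c for c in self.columns if c != self.key]
        return list(self.columns)

    def _reset_cache(self, name):
        # Drop the cached value so it is recomputed on the next access.
        self.__dict__.pop(name, None)

    def first(self):
        # Narrows the selection and resets the cache, but never restores the flag.
        self._exclude_key = True
        self._reset_cache("selected_columns")
        return self.selected_columns

    def width(self):
        return len(self.selected_columns)


toy = ToyGroupBy(["group", "x", "y", "z"], key="group")
width_before = toy.width()
toy.first()
width_after = toy.width()
print(width_before, width_after)
```

In this toy, restoring the flag and clearing the cache again at the end of first() would make width() stable across calls; whether that corresponds to the right fix in pandas is not something this sketch can establish.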
take
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
(optional) I have confirmed this bug exists on the master branch of pandas.
Source code
Problem description
In the code above, the same function call
grps.apply(lambda x: x.shape[1]).unique()
gives different results. The first two times, before
grps.first().shape
is called, it returns 4. After
grps.first().shape
is called, it returns 3.
Running output
Expected Output
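Presumably the expectation is that the result does not depend on whether grps.first() was called in between. A minimal sketch of that check follows. The original source is not shown here, so the DataFrame, its column names, and the helper column_counts are assumptions made for illustration only.

```python
import warnings

import pandas as pd

# Hypothetical data: column names and values are assumptions for illustration.
df = pd.DataFrame(
    {
        "group": ["a", "a", "b", "b"],
        "x": [1, 2, 3, 4],
        "y": [5, 6, 7, 8],
        "z": [9, 10, 11, 12],
    }
)
grps = df.groupby("group")


def column_counts(grouped):
    """Return the distinct column counts of the frames passed to apply()."""
    with warnings.catch_warnings():
        # Some pandas versions warn when apply() operates on grouping columns.
        warnings.simplefilter("ignore")
        return sorted(grouped.apply(lambda x: x.shape[1]).unique().tolist())


before = column_counts(grps)
first_shape = grps.first().shape  # the call reported to change later results
after = column_counts(grps)

print("before:", before, "first().shape:", first_shape, "after:", after)
```

On an affected version, such as the pandas 1.0.3 reported in this thread, before and after would be expected to differ in the way described above. On a version without the bug they should be equal, although the exact count may vary between pandas versions depending on whether apply() passes the grouping column to the function.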
Environment
Python 3.7.7, Ubuntu 18.04.
Output of pd.show_versions():