groupby + shift drops group columns when as_index is False #13519
Comments
I found the same issue with groupby.shift and groupby.apply. Depending on the arguments, the result is pretty confusing: sometimes the groupby columns are dropped, sometimes they remain as columns.
output is:
pd.show_versions()
I am having the exact same issue. Is there any workaround?
Ran into the same issue today. There is a workaround. If you reset_index before the groupby, then assign still works correctly after the shift. So, if you're only trying to shift one column, you can do: `frm = frm.assign(shifted_val=frm.groupby('key').shift(1)['val'])` Or, if you're trying to shift the whole frame, you can assign the group col back: `shifted_frm = frm.groupby('key').shift(1)`
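A minimal sketch of both variants of that workaround, assuming a small made-up frame with a default integer index. The names `frm`, `key` and `val` follow the comment above, while the data, the `one_col` name and the final `assign(key=...)` step are illustrative assumptions:

```python
import pandas as pd

# Hypothetical data; only the column names come from the comment above.
frm = pd.DataFrame({"key": ["a", "a", "b", "b", "a"], "val": [1, 2, 3, 4, 5]})

# Variant 1: shift a single column within each group.
# The shifted Series keeps the original index, so assign() aligns it row by row.
one_col = frm.assign(shifted_val=frm.groupby("key")["val"].shift(1))

# Variant 2: shift the whole frame, then put the grouping column back.
# This relies on the shifted frame sharing frm's index (assumed here).
shifted_frm = frm.groupby("key").shift(1)
shifted_frm = shifted_frm.assign(key=frm["key"])

print(one_col)
print(shifted_frm)
```

Both variants depend on the shifted result keeping the original row index, which is why the assignment lines up even though the grouping column is missing from the shift output.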
Also, not sure if anything changed since the original report, but now in v0.25, the grouping cols disappear after a groupby + shift regardless of the value of as_index.
Has there been any progress on this? I'm getting the same issue as @vladu.
@thoughtfuldata you or anyone else can solve this by contributing a pull request
I set up a pandas dev view on my machine this morning and poked around a bit. The underlying issue is actually much more widespread than just groupby + shift. It appears that all transforms drop the grouping columns. You can in fact see this in the documentation: Note that in the examples section, the frame is grouped by column 'A', and that column appears nowhere in the output. I'll poke around some more, but I'm worried this might have wider-ranging consequences than I originally anticipated. The transforms do preserve the original frame's index, so an easy workaround is to set_index first, then groupby the same columns. But that's inconsistent with agg operations, so I'll see if I can come up with a fix that isn't too extensive.
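To make the set_index workaround concrete, here is a rough sketch assuming a hypothetical frame grouped by column 'A'. The data and the variable names (`transformed`, `indexed`, `kept`, `restored`) are made up for illustration, and `cumsum` stands in for any transform:

```python
import pandas as pd

# Hypothetical frame; 'A' is the grouping column, as in the docs example mentioned above.
df = pd.DataFrame({"A": ["x", "x", "y", "y"], "B": [1, 2, 3, 4]})

# A transform returns a frame aligned to df's index, without the grouping column.
transformed = df.groupby("A").cumsum()

# Workaround: move 'A' into the index first, then group on that index level.
# The transform preserves the index, so 'A' survives the shift.
indexed = df.set_index("A")
kept = indexed.groupby(level="A").shift(1)

# Turn the index back into an ordinary column if needed.
restored = kept.reset_index()

print(transformed)
print(restored)
```

The trade-off is that the original row index is replaced by the grouping key, so this only fits cases where the original index does not need to be kept.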
the grouping cols disappear after a groupby + shift regardless of the value of as_index=True.
This is behaving as expected. shift is a transformation and so https://pandas.pydata.org/docs/user_guide/groupby.html#transformation A proposal to add functionality so that |
Using groupby + shift seems to have changed behaviour in 0.17 and 0.18 compared to 0.16.
With as_index=False, I would expect the columns that the groupby is made over to remain in the output dataframe, but they are no longer present.
Code Sample, a copy-pastable example if possible
Expected Output
output of
pd.show_versions()
I have also confirmed the issue on a similar Linux installation using pandas 0.18.1.