Propagate Series.name attribute when merging series into data frame #6124
Comments
@bburan-galenea pls confirm if #6265 is indeed a dupe (looks like it to me). pls add that example as a test if it's substantially different (I didn't look). thanks
When
There will be three groups, and the Series returned for each group will be named with the index of the slice it was derived from (which will be 1, 5 and 9). Because the series names are not consistent, the proposed solution in PR #6068 will fail: it checks that the series names are consistent before merging the series into a data frame.
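The original snippet for this case did not survive on this page, so the following is only a sketch of the kind of group/apply/iloc call being described. The frame, its column names and the group sizes are made up so that the slice labels come out as 1, 5 and 9; it assumes a reasonably recent pandas.

```python
import pandas as pd

# Hypothetical data: three groups of four rows on a default integer index.
df = pd.DataFrame({
    "group": ["a"] * 4 + ["b"] * 4 + ["c"] * 4,
    "value": range(12),
})

# A row taken with iloc is a Series named after its original index label,
# so each group hands back a differently named Series.
names = [g.iloc[1].name for _, g in df.groupby("group")]
print(names)

# The per-group Series share the same index but not the same name,
# so there is no single name to propagate to the result's columns.
result = df.groupby("group")[["value"]].apply(lambda g: g.iloc[1])
print(result.columns.name)
```

With inconsistent names like these, any check that requires one shared series name has nothing to propagate.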
@bburan-galenea ok...I think you can disambiguate that. Unfortunately groupby handles a lot of cases!
I'm not sure what you mean by disambiguating that. There are several approaches:
Once there's agreement on which approach is best, I can implement it.
go with 2 (don't name) and see what effects this has. don't want 1 as will break compat (or does it?) 3 - too many keywords already.. :)
Thanks! 1 will probably break compatibility in someone's code (I can think of a few cases in some old analyses I've done where I might have done something similar with group/apply/iloc). So, I will go with 2.
See #6068
Use case
Facilitate DataFrame group/apply transformations when using a function that returns a Series. Right now, if we perform the following:
We get the following output:
Ideally, the series name should be preserved and propagated through these operations such that we get the following output:
The only way to achieve this (currently) is:
However, this has two problems: 1) it adds an extra line of code, and 2) the name of the series created in the applied function may not be known in the outside block (so we can't properly fix the result.columns.name attribute).
The other work-around is to name the index of the series:
During the group/apply operation, one approach is to check whether series.index has its name attribute set. If it is not set, the index.name attribute is set to the name of the series, which ensures the name propagates.
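The code blocks from this description were not preserved on this page, so here is a minimal sketch of the two work-arounds mentioned above, not the original example. The data, the column names and the label "stat" are assumptions chosen for illustration, and the behaviour is assumed to match a reasonably recent pandas.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "x": [1.0, 2.0, 3.0, 4.0],
    "y": [5.0, 6.0, 7.0, 8.0],
})
grouped = df.groupby("group")[["x", "y"]]


def summarize(g):
    # Returns an unnamed Series indexed by the column labels.
    return g.mean()


def summarize_named_index(g):
    # Work-around 2: name the index of the returned Series.
    # rename_axis returns a new Series, so the group's columns are not mutated.
    return g.mean().rename_axis("stat")


# Work-around 1: fix the columns name after the fact (the extra line of code).
fixed_after = grouped.apply(summarize)
fixed_after.columns.name = "stat"

# Work-around 2: the index name set inside the function carries over to the columns.
fixed_inside = grouped.apply(summarize_named_index)

print(fixed_after.columns.name, fixed_inside.columns.name)
```

The second form keeps the label next to the code that builds the Series, which addresses the concern that the outer block may not know the name.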