DOC: update groupby NA group handling / workaround #5456
Comments
you have None in your groups, which are dropped, see here if you add The way to 'solve' this problem is to fill the groups with a string, group, perform your operation, then if you really-really want a
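A minimal sketch of the workaround described in the comment above (fill the missing group keys with a string, group, perform the operation, then optionally map the placeholder back). The frame, the column names, and the `PLACEHOLDER` sentinel are assumptions made for illustration; they are not taken from the original report.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two rows have a missing group key.
df = pd.DataFrame({
    "key": ["a", None, "b", None, "a"],
    "value": [1, 2, 3, 4, 5],
})

# Default grouping: rows whose key is missing are dropped from the result.
default_sums = df.groupby("key")["value"].sum()

# Workaround: fill missing keys with a placeholder string, then group and operate.
PLACEHOLDER = "__missing__"  # hypothetical sentinel; must not collide with a real key
filled = df.assign(key=df["key"].fillna(PLACEHOLDER))
filled_sums = filled.groupby("key")["value"].sum()

# Optionally map the placeholder back to NaN in the result index.
restored = filled_sums.rename(index={PLACEHOLDER: np.nan})

print(default_sums)
print(filled_sums)
```

The placeholder only works if it cannot occur as a genuine key, so it is worth checking the column before filling. Newer pandas releases (1.1 and later) also accept `dropna=False` in `groupby`, which keeps the NA group without a placeholder.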
That's good. So it's not a bug and everything is much easier. To remaining points: Thanks a lot for your quick answer, jreback.
ok...will convert this issue to a doc updating one then...thanks for the comments
I'm adding something to this - just to bring this up the list. So what exactly has to be done - there needs to be a Doc change to the docs itself or the docstring as well?
would add an example of how to work around (like the above), here
As described in #47337 (review), there is
Add more explicit docs / work-around for dealing with groupby and NA groups
(see comments)
Changelog: 07.Nov.2013: Add line to example below to preprocess table content.
I expect the following behavior: A `DataFrame.groupby` splits the dataframe/table into subtables according to the grouping condition. A column name as a grouping condition gives me a subtable for each individual value in that column. Similarly, grouping by multiple columns (a list of column names) gives me a group for each occurring combination of these columns (or, put differently, the unique "values" of multiple grouping columns are tuples).

If my expectations are wrong, I could not read a different meaning or expected behavior from the documentation (e.g. `pandas.DataFrame.groupby.__doc__`), so there is a lack of clarification.

Otherwise I found a bug and need a fix: some existing combinations are not given a group or split subtable; I checked this with `drop_duplicates`. And, finally, `grouped.__iter__` ignores more/other combinations than `grouped.groups.keys()`. Here I would also expect both to follow the same implementation...

I tracked it into the depths of pandas to `pandas.core.Grouper._get_group_keys`, or better `_KeyMapper.get_key`. `self.levels` looks good, but the list-comprehension-getmethod-zip-action goes wrong, or eventually `pandas.core.Grouper.group_info` provides a too small `ngroups` value, or something else.

`pandas.__version__`: 0.12.0-1062-g3c57949 (from 6.11.2013)
`numpy.__version__`: 1.7.2
MacOSX 10.9
Test Example:
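A minimal sketch that reproduces the reported behavior, using hypothetical data (this is not the original poster's table, and the column names `a`, `b`, `v` are assumptions). It compares the combinations found with `drop_duplicates` against the keys seen when iterating over the grouped object.

```python
import pandas as pd

# Hypothetical table: some rows have a missing value in a grouping column.
df = pd.DataFrame({
    "a": ["x", "x", "y", None],
    "b": ["p", None, "q", "q"],
    "v": [1, 2, 3, 4],
})

# Every (a, b) combination that occurs in the table, including ones with missing values.
combos = df[["a", "b"]].drop_duplicates()

grouped = df.groupby(["a", "b"])

# Keys seen when iterating; rows with a missing value in any key column do not appear.
iter_keys = [key for key, _ in grouped]

print(len(combos))  # number of occurring combinations
print(iter_keys)    # keys that actually received a group
```

With default settings, the combinations that contain a missing value get no group, which matches the explanation given in the comments that NA groups are dropped.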