-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ENH Consistent apply output when grouping with freq #12362
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH Consistent apply output when grouping with freq #12362
Conversation
tests! |
I wanted to make sure it wouldn't be outright rejected first. :-) I'll write tests now. |
69dc772
to
5571ac7
Compare
Tests. On the master branch, |
@@ -2027,7 +2009,10 @@ def names(self): | |||
@property | |||
def groupings(self): | |||
# for compat | |||
return None | |||
MockGrouping = collections.namedtuple( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
create an actual Grouping
object here rather than a mock
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, that looks a lot better. Fixed.
5571ac7
to
1ad4ebe
Compare
Can u have a look thru online groupby issues (there are a few!) and see if this has some similarity with any I recall one that had to do with the dtype changing in apply (and had to do with that code u removed) so maybe this would fix another issue |
A search through issues tagged "Groupby", or containing "TimeGrouper" and "apply", or "Grouper", "apply", and "freq" doesn't show anything that looks to me like it would be fixed by this. |
your PR might close #5839 |
i'll have to look at this more, but pls add a whatsnew |
apply
output when grouping with freq
1ad4ebe
to
1a2bdc5
Compare
I added a What's New entry. Not sure the best way to describe / categorize this -- perhaps it belongs under bugs? It doesn't look like #5839 is affected. I checked to be sure, and the examples there are unchanged by this PR. This PR only affects I scanned through the docs and didn't see any examples or instructions that would be affected by this PR. |
def sumfunc_series(x): | ||
return pd.Series([x['value'].sum()], ('sum',)) | ||
|
||
grp_out = df.groupby(pd.Grouper(key='date')).apply(sumfunc_series) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you also tests that these produce the same output when doing a .resample(..).apply(..)
(both functions)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
call this result
instead of grp_out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed the name of grp_out
to expected, since that's the thing left unchanged by this PR.
I am not convinced this is doing the right thing. notice the return types dtypes.
|
I'm not sure if it's doing the right thing, but this PR doesn't change the output of When you suggest testing the output of
Corresponding outputs using a resampling groupby, this PR:
Outputs on master branch:
The test failure is in |
you can rebase on master; the |
c41eb84
to
046a0fb
Compare
Rebased. |
can you rebase/update (and move note to 0.18.1). and i'll take a look. |
046a0fb
to
ad087fc
Compare
@jreback , done. |
@@ -73,8 +73,7 @@ API changes | |||
|
|||
- ``CParserError`` is now a ``ValueError`` instead of just an ``Exception`` (:issue:`12551`) | |||
|
|||
|
|||
|
|||
- Using ``apply`` on resampling groupby operations now has the same output types as similar ``apply``s on other groupby operations. (:issue:`11742`, :issue:`12362`). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
don't list the PR number, only the issue number. give some more context, like: .groupby(...).apply(...)
. I think a brief example might be waranted here (e.g. show old and new)
c5db074
to
c7b5b3e
Compare
The `BinGrouper.apply` and `BaseGrouper.apply` have different output types. To make them consistent, remove `BinGrouper.apply` and let it use the same method as the superclass `BaseGrouper`. This requires changing `BinGrouper.groupings` to return a list of `Grouping` objects (there will always only be one) instead of `None`.
c7b5b3e
to
d73f332
Compare
@jreback , I added extra detail and examples to the What's New, and rebased. Sorry for the delay. |
|
||
.. code-block:: python |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make the new an ipython block (so the code runs)
thanks @stephen-hoover some doc-comments. ping when pushed and i can look. |
3d43dde
to
3b9c4f7
Compare
3b9c4f7
to
8cf618f
Compare
@jreback , modified the docs. Let me know if this looks good. |
thanks @stephen-hoover |
The
BinGrouper.apply
(used by theTimeGrouper
) andBaseGrouper.apply
(used by theGrouper
) have different output types. To make them consistent, removeBinGrouper.apply
and let it use the same method as the superclassBaseGrouper
. This requires changingBinGrouper.groupings
to return an object which looks sufficiently like aGrouping
object forBaseGrouper.apply
to be happy. Use a namedtuple to create an object with the needed attributes.I think this change will only affect the output of
apply
with a custom function, and only when grouping by aTimeGrouper
(although aGrouper
with afreq
specified turns into aTimeGrouper
, so that counts).Closes #11742 .