[BUG] Fixed behavior of DataFrameGroupBy.apply to respect _group_selection_context #29131
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -398,6 +398,81 @@ keywords. | |
|
||
df.rename(index={0: 1}, columns={0: 2}) | ||
|
||
|
||
.. _whatsnew_1000.api_breaking.GroupBy.apply: | ||
|
||
``GroupBy.apply`` behaves consistently with ``as_index`` | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
- Previously, the result of :meth:`GroupBy.apply` sometimes contained the grouper column(s) | ||
in both the index and the ``DataFrame``. :meth:`GroupBy.apply` | ||
now respects the ``as_index`` parameter and only returns the grouper column(s) in | ||
the result if ``as_index`` is set to ``False``. Other methods such as :meth:`GroupBy.resample` | ||
exhibited similar behavior and now also respect the ``as_index`` parameter. | ||
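
Because the treatment of the grouper column inside ``apply`` depends on the pandas version, the following is a minimal sketch of a version-independent way to keep the grouper out of the result columns. It is not part of this PR; it assumes only the standard column-selection syntax on a ``groupby`` object and reuses the ``df`` from the examples in this note.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 1, 2, 2, 3, 3], "b": [1, 2, 3, 4, 5, 6]})

# Select the value column(s) explicitly, so each group passed to the
# function contains only "b". The grouper "a" then appears only in the
# index of the result, not among its columns.
result = df.groupby("a")[["b"]].apply(lambda x: x.sum())
print(result)
```

Selecting columns up front makes the intent explicit and avoids relying on how ``apply`` handles the grouping column.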
|
||
*Previous Behavior* | ||
|
||
.. code-block:: ipython | ||
|
||
In [1]: df = pd.DataFrame({"a": [1, 1, 2, 2, 3, 3], "b": [1, 2, 3, 4, 5, 6]}) | ||
|
||
Review comment: show df here |
||
In [2]: df.groupby("a").apply(lambda x: x.sum()) | ||
Out[2]: | ||
a b | ||
a | ||
1 2 3 | ||
2 4 7 | ||
3 6 11 | ||
|
||
In [3]: df.groupby("a").apply(lambda x: x.iloc[0]) | ||
Out[3]: | ||
a b | ||
a | ||
1 1 1 | ||
2 2 3 | ||
3 3 5 | ||
|
||
In [4]: idx = pd.date_range('1/1/2000', periods=4, freq='T') | ||
|
||
In [5]: df = pd.DataFrame(data=4 * [range(2)], | ||
...: index=idx, | ||
...: columns=['a', 'b']) | ||
|
||
In [6]: df.iloc[2, 0] = 5 | ||
|
||
Review comment: show df |
||
In [7]: df.groupby('a').resample('M').sum() | ||
Out[7]: | ||
a b | ||
a | ||
0 2000-01-31 0 3 | ||
5 2000-01-31 5 1 | ||
|
||
|
||
*Current Behavior* | ||
|
||
.. ipython:: python | ||
Review comment: break this up into 2 or more examples; it's just too hard to follow like this. Meaning: change1, change2 |
||
|
||
df = pd.DataFrame({"a": [1, 1, 2, 2, 3, 3], "b": [1, 2, 3, 4, 5, 6]}) | ||
df.groupby("a").apply(lambda x: x.sum()) | ||
df.groupby("a").apply(lambda x: x.iloc[0]) | ||
idx = pd.date_range('1/1/2000', periods=4, freq='T') | ||
df = pd.DataFrame(data=4 * [range(2)], | ||
index=idx, | ||
columns=['a', 'b']) | ||
df.iloc[2, 0] = 5 | ||
df.groupby('a').resample('M').sum() | ||
|
||
|
||
All :class:`SeriesGroupBy` aggregation methods now respect the ``observed`` keyword | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
The following methods now also correctly output values for unobserved categories when called through ``groupby(..., observed=False)`` (:issue:`17605`) | ||
|
||
- :meth:`SeriesGroupBy.count` | ||
Review comment: where are these tested? Can you do this as a separate change? |
||
- :meth:`SeriesGroupBy.size` | ||
- :meth:`SeriesGroupBy.nunique` | ||
- :meth:`SeriesGroupBy.nth` | ||
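
To illustrate what "unobserved categories" means here, a small sketch using :meth:`SeriesGroupBy.count` follows. The frame and the column names ``cat`` and ``v`` are made up for illustration, and the output shown by ``print`` reflects recent pandas releases rather than anything verified against this PR.

```python
import pandas as pd

# "z" is a declared category that never occurs in the data.
df = pd.DataFrame(
    {
        "cat": pd.Categorical(["x", "x", "y"], categories=["x", "y", "z"]),
        "v": [1, 2, 3],
    }
)

# observed=False: every declared category gets a row, including "z".
counts_all = df.groupby("cat", observed=False)["v"].count()

# observed=True: only categories present in the data get a row.
counts_observed = df.groupby("cat", observed=True)["v"].count()

print(counts_all)
print(counts_observed)
```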
|
||
|
||
Extended verbose info output for :class:`~pandas.DataFrame` | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
|
Review comment: need to move to 1.1