@@ -22,6 +22,8 @@ users upgrade to this version.

  - :ref:`API Changes <whatsnew_0140.api>`

+ - :ref:`Groupby API Changes <whatsnew_0140.groupby>`
+
  - :ref:`Performance Improvements <whatsnew_0140.performance>`

  - :ref:`Prior Deprecations <whatsnew_0140.prior_deprecations>`
@@ -95,57 +97,6 @@ API changes

  - Add ``is_month_start``, ``is_month_end``, ``is_quarter_start``, ``is_quarter_end``, ``is_year_start``, ``is_year_end`` accessors for ``DateTimeIndex`` / ``Timestamp`` which return a boolean array of whether the timestamp(s) are at the start/end of the month/quarter/year defined by the frequency of the ``DateTimeIndex`` / ``Timestamp`` (:issue:`4565`, :issue:`6998`)

- - More consistent behaviour for some groupby methods:
-
- groupby ``head`` and ``tail`` now act more like ``filter`` rather than an aggregation:
-
- .. ipython:: python
-
- df = pd.DataFrame([[1, 2], [1, 4], [5, 6]], columns=['A', 'B'])
- g = df.groupby('A')
- g.head(1) # filters DataFrame
-
- g.apply(lambda x: x.head(1)) # used to simply fall-through
-
- groupby head and tail respect column selection:
-
- .. ipython:: python
-
- g[['B']].head(1)
-
- groupby ``nth`` now filters by default, with optional dropna argument to ignore
- NaN (to replicate the previous behaviour.), See :ref:`the docs <groupby.nth>`.
-
- .. ipython:: python
-
- df = DataFrame([[1, np.nan], [1, 4], [5, 6]], columns=['A', 'B'])
- g = df.groupby('A')
- g.nth(0) # can also use negative ints
-
- g.nth(0, dropna='any') # similar to old behaviour
-
- groupby will now not return the grouped column for non-cython functions (:issue:`5610`, :issue:`5614`, :issue:`6732`),
- as its already the index
-
- .. ipython:: python
-
- df = DataFrame([[1, np.nan], [1, 4], [5, 6], [5, 8]], columns=['A', 'B'])
- g = df.groupby('A')
- g.count()
- g.describe()
-
- passing ``as_index`` will leave the grouped column in-place (this is not change in 0.14.0)
-
- .. ipython:: python
-
- df = DataFrame([[1, np.nan], [1, 4], [5, 6], [5, 8]], columns=['A', 'B'])
- g = df.groupby('A',as_index=False)
- g.count()
- g.describe()
-
- - Allow specification of a more complex groupby via ``pd.Grouper``, such as grouping
- by a Time and a string field simultaneously. See :ref:`the docs <groupby.specify>`. (:issue:`3794`)
-
  - Local variable usage has changed in
  :func:`pandas.eval`/:meth:`DataFrame.eval`/:meth:`DataFrame.query`
  (:issue:`5987`). For the :class:`~pandas.DataFrame` methods, two things have
@@ -247,6 +198,62 @@ API changes
  from 0.13.1
  - Added ``factorize`` functions to ``Index`` and ``Series`` to get indexer and unique values (:issue:`7090`)

+ .. _whatsnew_0140.groupby:
+
+ Groupby API Changes
+ ~~~~~~~~~~~~~~~~~~~
+
+ More consistent behaviour for some groupby methods:
+
+ - groupby ``head`` and ``tail`` now act more like ``filter`` rather than an aggregation:
+
+ .. ipython:: python
+
+ df = pd.DataFrame([[1, 2], [1, 4], [5, 6]], columns=['A', 'B'])
+ g = df.groupby('A')
+ g.head(1) # filters DataFrame
+
+ g.apply(lambda x: x.head(1)) # used to simply fall-through
+
+ - groupby ``head`` and ``tail`` respect column selection:
+
+ .. ipython:: python
+
+ g[['B']].head(1)
+
+ - groupby ``nth`` now filters by default, with an optional ``dropna`` argument to ignore
+ NaN (to replicate the previous behaviour). See :ref:`the docs <groupby.nth>`.
+
+ .. ipython:: python
+
+ df = pd.DataFrame([[1, np.nan], [1, 4], [5, 6]], columns=['A', 'B'])
+ g = df.groupby('A')
+ g.nth(0) # can also use negative ints
+
+ g.nth(0, dropna='any') # similar to old behaviour
+
+ - groupby will no longer return the grouped column for non-cython functions (:issue:`5610`, :issue:`5614`, :issue:`6732`),
+ as it is already the index
+
+ .. ipython:: python
+
+ df = pd.DataFrame([[1, np.nan], [1, 4], [5, 6], [5, 8]], columns=['A', 'B'])
+ g = df.groupby('A')
+ g.count()
+ g.describe()
+
+ - passing ``as_index=False`` will leave the grouped column in place (this is not a change in 0.14.0)
+
+ .. ipython:: python
+
+ df = pd.DataFrame([[1, np.nan], [1, 4], [5, 6], [5, 8]], columns=['A', 'B'])
+ g = df.groupby('A', as_index=False)
+ g.count()
+ g.describe()
+
+ - Allow specification of a more complex groupby via ``pd.Grouper``, such as grouping
+ by a Time and a string field simultaneously. See :ref:`the docs <groupby.specify>`. (:issue:`3794`)
+
  .. _whatsnew_0140.sql:

  SQL
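The ``pd.Grouper`` entry moved by this diff mentions grouping by a time field and a string field simultaneously, but it carries no example. A minimal sketch of that usage could look like the following. The data and the column names (``date``, ``buyer``, ``quantity``) are made up for illustration and do not come from this commit or from the pandas docs.

```python
import pandas as pd

# Hypothetical purchase log; the column names are illustrative assumptions.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2014-01-01 09:00",
        "2014-01-01 17:00",
        "2014-01-02 10:00",
        "2014-01-02 11:00",
    ]),
    "buyer": ["carl", "mark", "carl", "carl"],
    "quantity": [1, 3, 5, 7],
})

# Group by calendar day (a time-based key built with pd.Grouper)
# and by buyer (a plain string column) in a single groupby call.
totals = df.groupby([pd.Grouper(key="date", freq="D"), "buyer"])["quantity"].sum()

# carl's two purchases on 2014-01-02 fall into the same (day, buyer) group.
print(totals)
```

Each key in the list contributes one level to the index of the result, so ``totals`` is indexed by (day, buyer) pairs.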