Commit 20dd124

Avoided looping and multiple np calls, added api breaking section to whatsnew
1 parent f799b42 commit 20dd124

2 files changed, +42 -9 lines changed

doc/source/whatsnew/v1.1.0.rst

+29
@@ -380,6 +380,35 @@ Assignment to multiple columns of a :class:`DataFrame` when some of the columns
     df[['a', 'c']] = 1
     df
 
+.. _whatsnew_110.api_breaking.as_index_false_with_std_and_sem:
+
+:meth:`DataFrameGroupby.std` and :meth:`DataFrameGroupby.sem` preserve group keys when ``as_index=False``
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Using :meth:`DataFrameGroupby.std` and :meth:`DataFrameGroupby.sem` would previously alter the group keys when ``as_index=False``. Now, they are correctly left as the group keys. (:issue:`10355`)
+
+.. ipython:: python
+
+    df = pd.DataFrame({"a": [0, 0, 1, 1, 2, 2], "b": [1, 1, 2, 3, 5, 8]})
+    df
+
+*Previous behavior*:
+
+.. code-block:: ipython
+
+    In [3]: df.groupby("a", as_index=False).std()
+    Out[3]:
+              a         b
+    0  0.000000  0.000000
+    1  1.000000  0.707107
+    2  1.414214  2.121320
+
+*New behavior*:
+
+.. ipython:: python
+
+    df.groupby("a", as_index=False).std()
+
 .. _whatsnew_110.deprecations:
 
 Deprecations
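
The behavior described in the whatsnew entry can be checked with a short script. This is a minimal sketch, assuming pandas 1.1.0 or later is installed; the frame is the one used in the entry, and the name `result` is chosen here for illustration only.

```python
import pandas as pd

# Same frame as in the whatsnew entry.
df = pd.DataFrame({"a": [0, 0, 1, 1, 2, 2], "b": [1, 1, 2, 3, 5, 8]})

# With as_index=False the group keys come back as a regular column.
result = df.groupby("a", as_index=False).std()

# On pandas >= 1.1.0 the key column "a" is left as-is; only "b" holds
# per-group standard deviations.
print(result)
```

On a version with this change, column `a` should read 0, 1, 2 rather than the square roots of the keys shown under *Previous behavior*.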

pandas/core/groupby/groupby.py

+13-9
@@ -1271,17 +1271,19 @@ def std(self, ddof: int = 1):
             Degrees of freedom.
 
         Returns
-        -------
+        -------s
         Series or DataFrame
             Standard deviation of values within each group.
         """
         result = self.var(ddof=ddof)
         if result.ndim == 1:
             result = np.sqrt(result)
         else:
-            for col in result:
-                if col not in self.exclusions:
-                    result[col] = np.sqrt(result[col])
+            cols = result.columns.get_indexer_for(
+                result.columns.difference(self.exclusions).unique()
+            )
+            result.iloc[:, cols] = np.sqrt(result.iloc[:, cols]).values
+
         return result
 
     @Substitution(name="groupby")
@@ -1331,13 +1333,15 @@ def sem(self, ddof: int = 1):
             Standard error of the mean of values within each group.
         """
         result = self.std(ddof=ddof)
-        denom = np.sqrt(self.count())
         if result.ndim == 1:
-            result /= denom
+            result /= np.sqrt(self.count())
         else:
-            for col in result:
-                if col not in self.exclusions:
-                    result[col] /= denom[col]
+            cols = result.columns.get_indexer_for(
+                result.columns.difference(self.exclusions).unique()
+            )
+            result.iloc[:, cols] = (
+                result.iloc[:, cols].values / np.sqrt(self.count().iloc[:, cols]).values
+            )
         return result
 
     @Substitution(name="groupby")
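
The column selection that replaces the per-column loops in `std` and `sem` can be tried outside the groupby machinery. The following is a rough sketch, not the pandas implementation: `var`, `count`, and `exclusions` are hypothetical stand-ins for `self.var(ddof=ddof)`, `self.count()`, and `self.exclusions` when `as_index=False`, and a list is used for `exclusions` purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the intermediate frames (as_index=False case).
var = pd.DataFrame({"a": [0, 1, 2], "b": [0.0, 0.5, 4.5]})
count = pd.DataFrame({"a": [0, 1, 2], "b": [2, 2, 2]})
exclusions = ["a"]  # assumed to play the role of self.exclusions

# Integer positions of every column that is not a group key.
cols = var.columns.get_indexer_for(var.columns.difference(exclusions).unique())

# std: a single vectorized sqrt over the value columns; keys are untouched.
std = var.copy()
std.iloc[:, cols] = np.sqrt(var.iloc[:, cols]).values

# sem: divide the same columns by sqrt(count) in one step.
sem = std.copy()
sem.iloc[:, cols] = std.iloc[:, cols].values / np.sqrt(count.iloc[:, cols]).values

print(std)
print(sem)
```

Selecting by position with `iloc` means one `np.sqrt` call covers all value columns at once, while the key column `a` is never touched.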
