Commit 5bdcb40

mroeschke authored and topper-123 committed
BUG: groupby.apply raising a TypeError when __getitem__ selects multiple columns (pandas-dev#53207)
* BUG: groupby.apply raising a TypeError when __getitem__ selects multiple columns
* Fix whatsnew
* typing
* Typing
1 parent 2f842b6 commit 5bdcb40

3 files changed, +24 -4 lines changed

doc/source/whatsnew/v2.1.0.rst

+1 -1
@@ -449,10 +449,10 @@ Groupby/resample/rolling
   or :class:`PeriodIndex`, and the ``groupby`` method was given a function as its first argument,
   the function operated on the whole index rather than each element of the index. (:issue:`51979`)
 - Bug in :meth:`DataFrameGroupBy.apply` causing an error to be raised when the input :class:`DataFrame` was subset as a :class:`DataFrame` after groupby (``[['a']]`` and not ``['a']``) and the given callable returned :class:`Series` that were not all indexed the same. (:issue:`52444`)
+- Bug in :meth:`DataFrameGroupBy.apply` raising a ``TypeError`` when selecting multiple columns and providing a function that returns ``np.ndarray`` results (:issue:`18930`)
 - Bug in :meth:`GroupBy.groups` with a datetime key in conjunction with another key produced incorrect number of group keys (:issue:`51158`)
 - Bug in :meth:`GroupBy.quantile` may implicitly sort the result index with ``sort=False`` (:issue:`53009`)
 - Bug in :meth:`GroupBy.var` failing to raise ``TypeError`` when called with datetime64, timedelta64 or :class:`PeriodDtype` values (:issue:`52128`, :issue:`53045`)
--
 
 Reshaping
 ^^^^^^^^^

pandas/core/groupby/generic.py

+11 -3
@@ -51,6 +51,7 @@
     CategoricalDtype,
     IntervalDtype,
 )
+from pandas.core.dtypes.inference import is_hashable
 from pandas.core.dtypes.missing import (
     isna,
     notna,
@@ -1540,9 +1541,16 @@ def _wrap_applied_output(
             # fall through to the outer else clause
             # TODO: sure this is right? we used to do this
             # after raising AttributeError above
-            return self.obj._constructor_sliced(
-                values, index=key_index, name=self._selection
-            )
+            # GH 18930
+            if not is_hashable(self._selection):
+                # error: Need type annotation for "name"
+                name = tuple(self._selection)  # type: ignore[var-annotated, arg-type]
+            else:
+                # error: Incompatible types in assignment
+                # (expression has type "Hashable", variable
+                # has type "Tuple[Any, ...]")
+                name = self._selection  # type: ignore[assignment]
+            return self.obj._constructor_sliced(values, index=key_index, name=name)
         elif not isinstance(first_not_none, Series):
             # values are not series or array-like but scalars
             # self._selection not passed through to Series as the

pandas/tests/groupby/test_apply.py

+12
@@ -1403,3 +1403,15 @@ def test_apply_inconsistent_output(group_col):
     )
 
     tm.assert_series_equal(result, expected)
+
+
+def test_apply_array_output_multi_getitem():
+    # GH 18930
+    df = DataFrame(
+        {"A": {"a": 1, "b": 2}, "B": {"a": 1, "b": 2}, "C": {"a": 1, "b": 2}}
+    )
+    result = df.groupby("A")[["B", "C"]].apply(lambda x: np.array([0]))
+    expected = Series(
+        [np.array([0])] * 2, index=Index([1, 2], name="A"), name=("B", "C")
+    )
+    tm.assert_series_equal(result, expected)
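
The name-selection branch added to `_wrap_applied_output` can be sketched in isolation. A `Series` name must be hashable, and selecting with `[["B", "C"]]` leaves the selection as a list, which is not. The snippet below is a minimal, pandas-free sketch of that logic, not the library's code: the local `is_hashable` is a stand-in that is assumed to approximate the pandas helper imported in the diff, and `resolve_result_name` is a hypothetical name introduced only for illustration.

```python
from collections.abc import Hashable


def is_hashable(obj: object) -> bool:
    """Stand-in (assumption) for the pandas helper: True if hash(obj) succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


def resolve_result_name(selection: object) -> Hashable:
    """Hypothetical helper mirroring the branch added in the diff."""
    if not is_hashable(selection):
        # A list such as ["B", "C"] is unhashable, so convert it to a tuple,
        # which is a valid (hashable) Series name.
        return tuple(selection)
    # A single label, or None, is already usable as a name.
    return selection


print(resolve_result_name(["B", "C"]))  # ('B', 'C')
print(resolve_result_name("B"))  # B
print(resolve_result_name(None))  # None
```

This matches the new test, where the expected result is named `("B", "C")` rather than `["B", "C"]`.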
