BUG: groupby apply on selected columns yielding scalar (GH13568) #13585

Merged

1 change: 1 addition & 0 deletions doc/source/whatsnew/v0.19.0.txt
@@ -491,6 +491,7 @@ Bug Fixes
 - Bug in ``PeriodIndex`` construction returning a ``float64`` index in some circumstances (:issue:`13067`)
 - Bug in ``.resample(..)`` with a ``PeriodIndex`` not changing its ``freq`` appropriately when empty (:issue:`13067`)
 - Bug in ``.resample(..)`` with a ``PeriodIndex`` not retaining its type or name with an empty ``DataFrame`` appropriately when empty (:issue:`13212`)
+- Bug in ``groupby(..).apply(..)`` when the passed function returns scalar values per group (:issue:`13468`).
 - Bug in ``groupby(..).resample(..)`` where passing some keywords would raise an exception (:issue:`13235`)
 - Bug in ``.tz_convert`` on a tz-aware ``DateTimeIndex`` that relied on index being sorted for correct results (:issue:`13306`)
 - Bug in ``.tz_localize`` with ``dateutil.tz.tzlocal`` may return incorrect result (:issue:`13583`)

5 changes: 4 additions & 1 deletion pandas/core/groupby.py
@@ -3403,11 +3403,14 @@ def first_non_None_value(values):

                 return self._reindex_output(result)

+            # values are not series or array-like but scalars
             else:
                 # only coerce dates if we find at least 1 datetime
                 coerce = True if any([isinstance(x, Timestamp)
                                       for x in values]) else False
-                return (Series(values, index=key_index, name=self.name)
+                # self.name not passed through to Series as the result
+                # should not take the name of original selection of columns
+                return (Series(values, index=key_index)
                         ._convert(datetime=True,
                                   coerce=coerce))

10 changes: 10 additions & 0 deletions pandas/tests/test_groupby.py
@@ -2584,6 +2584,16 @@ def test_apply_series_yield_constant(self):
         result = self.df.groupby(['A', 'B'])['C'].apply(len)
         self.assertEqual(result.index.names[:2], ('A', 'B'))

+    def test_apply_frame_yield_constant(self):
+        # GH13568
+        result = self.df.groupby(['A', 'B']).apply(len)
+        self.assertTrue(isinstance(result, Series))
+        self.assertIsNone(result.name)
+
+        result = self.df.groupby(['A', 'B'])[['C', 'D']].apply(len)
+        self.assertTrue(isinstance(result, Series))
+        self.assertIsNone(result.name)
+
     def test_apply_frame_to_series(self):
         grouped = self.df.groupby(['A', 'B'])
         result = grouped.apply(len)
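
For readers who want to try the behavior that test_apply_frame_yield_constant checks, the following is a minimal sketch, not part of the PR. The sample frame is hypothetical (only the column names A, B, C, D are borrowed from the tests), and it assumes a reasonably recent pandas release rather than the 0.19 development branch this diff targets.

```python
import pandas as pd

# Hypothetical sample data; only the column names mirror the test fixture.
df = pd.DataFrame({
    "A": ["foo", "foo", "bar", "bar"],
    "B": ["one", "two", "one", "one"],
    "C": [1.0, 2.0, 3.0, 4.0],
    "D": [10.0, 20.0, 30.0, 40.0],
})

# Selecting a list of columns gives a DataFrame groupby. A function that
# returns one scalar per group should produce a Series with no name.
frame_result = df.groupby(["A", "B"])[["C", "D"]].apply(len)

# Selecting a single column gives a Series groupby. Here the result is
# expected to keep the name of the selected column.
series_result = df.groupby(["A", "B"])["C"].apply(len)

print(type(frame_result).__name__, frame_result.name)
print(type(series_result).__name__, series_result.name)
```

The contrast between the two calls is the point of the fix: a result built from several selected columns has no single column name to inherit, so it is left unnamed.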