BUG: Fix issue with apply on empty DataFrame #28213

Merged 22 commits on Sep 20, 2019
11 changes: 7 additions & 4 deletions pandas/core/apply.py
@@ -204,16 +204,19 @@ def apply_empty_result(self):
         from pandas import Series

         if not reduce:
-
-            EMPTY_SERIES = Series([])
             try:
-                r = self.f(EMPTY_SERIES, *self.args, **self.kwds)
+                r = self.f(Series([]))
                 reduce = not isinstance(r, Series)
             except Exception:
                 pass

         if reduce:
-            return self.obj._constructor_sliced(np.nan, index=self.agg_axis)
+            if len(self.agg_axis):
+                r = self.f(Series([]))
Contributor

pass args & kwargs

Contributor

We are already trying to reduce above (line 208), so why are you calling the function again? Does this hit the except?

Member Author

I did that in case reduce was already True at line 206, in which case the try block would not have been executed.

Contributor

I am still puzzled why you cannot pass args/kwargs.

Member Author (@dsaxton, Sep 13, 2019)

I think what's happening is the function self.f is getting curried around here:

if (kwds or args) and not isinstance(func, (np.ufunc, str)):

so when we pass the arguments again, we get an error (this is from the df.nunique() test):

>           r = self.f(Series([]), *self.args, **self.kwds)
E           TypeError: f() got an unexpected keyword argument 'dropna'

because at that point f only takes a single argument. If this currying doesn't happen (the condition above skips it when func is a np.ufunc or a str), I could imagine a problem, so there is probably a hidden corner case that the existing tests don't cover.
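
The failure described above can be reproduced without pandas. In this minimal sketch, `curry` and `count_unique` are hypothetical stand-ins (not pandas functions) for the wrapping step and for the function behind df.nunique(), and a plain list stands in for Series([]).

```python
def curry(func, args, kwds):
    """Bind args/kwds once and return a function of a single argument.

    Hypothetical stand-in for the wrapping step discussed above.
    """
    def f(x):
        return func(x, *args, **kwds)
    return f


def count_unique(values, dropna=True):
    """Hypothetical reducer that accepts a keyword, like nunique(dropna=...)."""
    if dropna:
        values = [v for v in values if v is not None]
    return len(set(values))


kwds = {"dropna": True}
f = curry(count_unique, (), kwds)

# Calling the wrapped function with only the data works.
ok = f([])

# Passing the keywords a second time fails, because f takes one argument.
try:
    f([], **kwds)
    error = None
except TypeError as exc:
    error = str(exc)
```

Under these assumptions, the second call raises a TypeError that mentions 'dropna', which is the same shape of error as in the traceback quoted above.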

Member

I was confused by this too, but I think this makes sense. I suppose this was hitting the except before due to a TypeError for the wrong number of arguments?

+            else:
+                r = np.nan
+
+            return self.obj._constructor_sliced(r, index=self.agg_axis)
         else:
             return self.obj.copy()
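
The control flow of the patched hunk can be sketched in plain Python. This is only an illustration under stated assumptions: a list stands in for Series([]), a dict keyed by the aggregation labels stands in for the reduced Series, and None stands in for self.obj.copy(). It is not the pandas implementation.

```python
import math


def apply_empty_result(f, agg_axis, reduce=False):
    """Sketch of the patched control flow, using plain Python objects.

    Assumptions: a list stands in for Series([]), a dict keyed by agg_axis
    stands in for the reduced Series, and None stands in for self.obj.copy().
    """
    if not reduce:
        try:
            r = f([])
            # A non-list result means f collapsed its input to a scalar.
            reduce = not isinstance(r, list)
        except Exception:
            pass

    if reduce:
        # Call f only when there are labels to fill; otherwise use NaN.
        r = f([]) if len(agg_axis) else math.nan
        return {label: r for label in agg_axis}
    return None


columns = ["a", "b", "c"]
summed = apply_empty_result(sum, columns)        # sum reduces to a scalar
untouched = apply_empty_result(sorted, columns)  # sorted returns a list
no_labels = apply_empty_result(sum, [])          # no labels to fill
```

The point of the change is visible in `summed`: the result carries whatever the function returns on empty input, rather than a fixed NaN.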

24 changes: 24 additions & 0 deletions pandas/tests/frame/test_apply.py
@@ -116,6 +116,30 @@ def test_apply_with_reduce_empty(self):
         # Ensure that x.append hasn't been called
         assert x == []

+    def test_apply_funcs_over_empty(self):
+        # GH 28213
+        df = DataFrame(columns=["a", "b", "c"])
+
+        result = df.apply(np.sum)
+        expected = df.sum()
+        assert_series_equal(result, expected)
+
+        result = df.apply(np.prod)
+        expected = df.prod()
+        assert_series_equal(result, expected)
+
+        result = df.apply(np.any)
+        expected = df.any()
+        assert_series_equal(result, expected)
+
+        result = df.apply(np.all)
+        expected = df.all()
+        assert_series_equal(result, expected)
+
+        result = df.nunique()
+        expected = Series(0, index=df.columns)
+        assert_series_equal(result, expected)
+
     def test_apply_deprecate_reduce(self):
         empty_frame = DataFrame()
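
The expectations in the new test follow from how each reduction behaves on empty input. As a rough illustration using only the Python standard library (not the NumPy or pandas calls the test itself exercises), each reduction returns its identity value rather than NaN:

```python
import math

empty = []

# Each reduction over an empty input yields its identity value.
results = {
    "sum": sum(empty),           # additive identity
    "prod": math.prod(empty),    # multiplicative identity
    "any": any(empty),           # no element is truthy
    "all": all(empty),           # vacuously true
    "nunique": len(set(empty)),  # no distinct values
}
```

This is why filling the result with np.nan, as the removed line did, disagreed with df.sum(), df.prod(), df.any(), and df.all() on an empty frame.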