CLN: Replace first_not_none function with default argument to next #33343

Merged: 2 commits, Apr 10, 2020
28 changes: 12 additions & 16 deletions pandas/core/groupby/generic.py
@@ -1193,20 +1193,14 @@ def _wrap_applied_output(self, keys, values, not_indexed_same=False):

         key_names = self.grouper.names

-        # GH12824.
-        def first_not_none(values):
-            try:
-                return next(com.not_none(*values))
-            except StopIteration:
-                return None
-
-        v = first_not_none(values)
+        # GH12824
+        first_not_none = next(com.not_none(*values), None)

-        if v is None:
+        if first_not_none is None:
             # GH9684. If all values are None, then this will throw an error.
             # We'd prefer it return an empty dataframe.
             return DataFrame()
-        elif isinstance(v, DataFrame):
+        elif isinstance(first_not_none, DataFrame):
             return self._concat_objects(keys, values, not_indexed_same=not_indexed_same)
         elif self.grouper.groupings is not None:
             if len(self.grouper.groupings) > 1:
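
The cleanup in this hunk relies on the two-argument form of the built-in `next`, which returns the supplied default instead of raising `StopIteration` when the iterator is exhausted. A minimal sketch of the equivalence follows. The local `not_none` is a stand-in written for this example, assumed to behave like pandas' internal `com.not_none` (a generator over the non-None arguments), and the two wrapper function names are hypothetical.

```python
def not_none(*args):
    """Stand-in for pandas' internal com.not_none (assumed behavior):
    yield the arguments that are not None, in order."""
    return (arg for arg in args if arg is not None)


def first_not_none_old(values):
    # Shape of the removed helper: catch StopIteration from an exhausted generator.
    try:
        return next(not_none(*values))
    except StopIteration:
        return None


def first_not_none_new(values):
    # Shape of the replacement: next() returns its second argument when exhausted.
    return next(not_none(*values), None)


samples = [[None, 3, None, 5], [None, None], [], [0, None]]
results = [(first_not_none_old(v), first_not_none_new(v)) for v in samples]
```

Both forms agree on every sample, including the all-None and empty inputs, and a falsy value such as `0` is still returned because the filter tests `is not None` rather than truthiness.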
@@ -1223,6 +1217,9 @@ def first_not_none(values):

                     # reorder the values
                     values = [values[i] for i in indexer]
+
+                    # update due to the potential reorder
+                    first_not_none = next(com.not_none(*values), None)
                 else:

                     key_index = Index(keys, name=key_names[0])
@@ -1232,20 +1229,19 @@ def first_not_none(values):
                         key_index = None

             # make Nones an empty object
-            v = first_not_none(values)
Member

It looks like there are two places where first_not_none is called, and values can change in between them.

Member Author

@rhshadrach, Apr 6, 2020

Ah, thanks. I was thinking that it's only the kind of object being used and so the second call was not needed, but it really should be updated. I moved the update into the if block so that it is only done when needed. For what it's worth, I added an assert to see if the value order ever did change and it never got triggered by the tests.
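
To illustrate the concern raised in this thread, here is a contrived sketch of how reordering a list can change which element is the first non-None one, so a value cached before the reorder can go stale. The helper name, the sample `values`, and the `indexer` are all made up for this example; in the real method the elements would be pandas objects and the indexer would come from `key_lookup.get_indexer(key_index)`.

```python
def first_not_none(values):
    # Hypothetical helper: first element that is not None, else None.
    return next((v for v in values if v is not None), None)


values = [None, "a", 1.5]   # made-up group results of differing types
indexer = [2, 0, 1]         # made-up reordering of positions

before = first_not_none(values)

# Same reorder idiom as in the diff.
reordered = [values[i] for i in indexer]

# Recompute after the reorder instead of reusing `before`.
after = first_not_none(reordered)
```

Here `before` is the string while `after` is the float, so any later `isinstance` check on the cached value could take a different branch than a check on the recomputed one. As the reply notes, the existing tests never exercised such a change, so this shows only that the case is possible in principle.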

-            if v is None:
+            if first_not_none is None:
                 return DataFrame()
-            elif isinstance(v, NDFrame):
+            elif isinstance(first_not_none, NDFrame):

                 # this is to silence a DeprecationWarning
                 # TODO: Remove when default dtype of empty Series is object
-                kwargs = v._construct_axes_dict()
-                if v._constructor is Series:
+                kwargs = first_not_none._construct_axes_dict()
+                if first_not_none._constructor is Series:
                     backup = create_series_with_explicit_dtype(
                         **kwargs, dtype_if_empty=object
                     )
                 else:
-                    backup = v._constructor(**kwargs)
+                    backup = first_not_none._constructor(**kwargs)

                 values = [x if (x is not None) else backup for x in values]
