CLN: get parts of Block.replace out of try/except #27408

Merged
merged 12 commits into from
Jul 25, 2019
Changes from 8 commits
26 changes: 19 additions & 7 deletions pandas/core/internals/blocks.py
@@ -733,6 +733,13 @@ def _try_coerce_args(self, other):
type(self).__name__.lower().replace("Block", ""),
)
)
if lib.is_scalar(other) and isna(other) and self.is_integer:
Contributor

This seems way too specific. In which case is the above expression, (np.any(...)) and not self._can_hold_element(other), not true?

Member Author

Because it's looking at NAs in the opposite way. The check above is for notna, whereas here we specifically want to catch np.nan.

Contributor

I have no problem with the is_scalar and isna checks; it's the is_integer check that is way too specific here.

Member Author

Totally agree that it shouldn't be necessary. In an earlier draft I had not self._can_hold_na instead of self.is_integer, but then BoolBlock started breaking. I had planned to revisit this in another step, but I would also be OK with trying to get this to the better not self._can_hold_na in this step.

Contributor

OK, yes, I think if you used _can_hold_na here I would be OK with this change.

Member Author

OK, I was able to make this prettier, but at the cost of making a check in Block.where uglier. I think this is going to continue to be whack-a-mole until we can finish getting rid of the "try"s.
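
For context on why a scalar NaN needs special handling for integer data: NumPy integer arrays have no NaN representation, so storing one fails, while a float array accepts it. The following is a minimal sketch on plain NumPy arrays, not on pandas Blocks; the helper name try_set_nan is made up for illustration and is not part of this PR.

```python
import numpy as np


def try_set_nan(arr):
    """Return True if NaN can be stored in a copy of ``arr`` (hypothetical helper)."""
    out = arr.copy()
    try:
        out[0] = np.nan
    except ValueError:
        # NumPy refuses to convert float NaN to an integer.
        return False
    return bool(np.isnan(out[0]))


ints = np.array([1, 2, 3], dtype="int64")
floats = np.array([1.0, 2.0, 3.0])

int_ok = try_set_nan(ints)
float_ok = try_set_nan(floats)
```

Here int_ok is False and float_ok is True, which is the distinction the is_integer / _can_hold_na discussion is trying to capture.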

raise TypeError(
"cannot convert {} to an {}".format(
type(other).__name__,
type(self).__name__.lower().replace("Block", ""),
)
)

return other

@@ -781,16 +788,15 @@ def replace(
inplace = validate_bool_kwarg(inplace, "inplace")
original_to_replace = to_replace

# try to replace, if we raise an error, convert to ObjectBlock and
# If we cannot replace with own dtype, convert to ObjectBlock and
# retry
values = self._coerce_values(self.values)
try:
to_replace = self._try_coerce_args(to_replace)
except (TypeError, ValueError):
if not self._can_hold_element(to_replace):
# TODO: we should be able to infer at this point that there is
# nothing to replace
# GH 22083, TypeError or ValueError occurred within error handling
# causes infinite loop. Cast and retry only if not objectblock.
if is_object_dtype(self):
raise
raise AssertionError
Contributor

This makes this block different from all the others. Why?

Member Author

In the status quo, raise re-raises the exception that is caught on 789. But now we don't have that exception to re-raise, so we need to raise a new one. The raise on 793 is not reached (and should not be reachable).
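
The point about bare raise can be illustrated in isolation. This is a small sketch in plain Python, unrelated to the pandas internals; all function names are illustrative only. Inside an except block a bare raise re-raises the active exception, while outside one there is nothing to re-raise and Python raises a RuntimeError instead.

```python
def reraise_inside_handler():
    try:
        int("not a number")
    except ValueError:
        # An exception is active here, so a bare `raise` re-raises it.
        raise


def raise_outside_handler():
    # No exception is being handled, so a bare `raise` has nothing to re-raise.
    raise


def name_of_error(func):
    """Call ``func`` and return the class name of whatever it raises."""
    try:
        func()
    except Exception as err:
        return type(err).__name__
    return None


inside = name_of_error(reraise_inside_handler)
outside = name_of_error(raise_outside_handler)
```

Here inside is "ValueError" and outside is "RuntimeError", which is why code moved out of the except clause needs an explicit exception type.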

Contributor

Why don't we instead define an internal exception, say DtypeInvalidException (or a better name), which inherits from this, and then use it for flow control?
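
A rough sketch of what that suggestion could look like, using a toy container rather than the real Block classes. Everything here is an assumption for illustration: the class IntValues, its methods, and the choice of TypeError as the base class are not taken from pandas or from this PR; only the name DtypeInvalidException comes from the comment above.

```python
class DtypeInvalidException(TypeError):
    """Hypothetical internal signal: the value does not fit the container's dtype."""


class IntValues:
    """Toy stand-in for an integer block; not the pandas Block API."""

    def __init__(self, values):
        self.values = list(values)

    def coerce_arg(self, other):
        # bool is a subclass of int, so exclude it explicitly.
        if isinstance(other, bool) or not isinstance(other, int):
            raise DtypeInvalidException(
                "cannot hold {!r} in an int container".format(other)
            )
        return other

    def _replace_generic(self, to_replace, value):
        return [value if v == to_replace else v for v in self.values]

    def replace(self, to_replace, value):
        try:
            key = self.coerce_arg(to_replace)
        except DtypeInvalidException:
            # Fall back to a generic (object-like) path instead of failing.
            return self._replace_generic(to_replace, value), "object"
        return self._replace_generic(key, value), "int"


block = IntValues([1, 2, 3])
fast = block.replace(2, 20)
fallback = block.replace("a", 0)
```

Catching only the dedicated exception keeps unrelated TypeError or ValueError failures from being silently swallowed by the fallback path.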


# try again with a compatible block
block = self.astype(object)
@@ -803,6 +809,9 @@ def replace(
convert=convert,
)

values = self._coerce_values(self.values)
to_replace = self._try_coerce_args(to_replace)

mask = missing.mask_missing(values, to_replace)
if filter is not None:
filtered_out = ~self.mgr_locs.isin(filter)
@@ -1401,7 +1410,10 @@ def where(self, other, cond, align=True, errors="raise", try_cast=False, axis=0)

# our where function
def func(cond, values, other):
other = self._try_coerce_args(other)

if not (self.is_integer and lib.is_scalar(other) and np.isnan(other)):
Contributor

This seems way too specific here (the is_integer check).

Member Author

I agree. At first I had not self._can_hold_na, but it turns out that breaks for BoolBlock.

Contributor

Same issue with the is_integer check.
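
The special case under discussion relies on NumPy's own upcasting, which can be checked directly. A minimal sketch with plain NumPy arrays (the variable names are illustrative and nothing here calls pandas internals): when the fill value is NaN, np.where returns a float64 array even though the input values are integers.

```python
import numpy as np

values = np.array([1, 2, 3], dtype="int64")
cond = np.array([True, False, True])

# Where cond is False, take NaN; the integer input is upcast to float64.
result = np.where(cond, values, np.nan)
```

The result has dtype float64 with NaN in the second position, so no prior coercion of the NaN scalar is needed for this path.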

# np.where will cast integer array to floats in this case
other = self._try_coerce_args(other)

try:
fastres = expressions.where(cond, values, other)