-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
CLN: enforce deprecation of interpolate
with object dtype
#57820
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLN: enforce deprecation of interpolate
with object dtype
#57820
Conversation
I enforced deprecation of |
pandas/core/generic.py
Outdated
if np.any(obj.dtypes == object): | ||
# GH#53631 | ||
if not (obj.ndim == 2 and np.all(obj.dtypes == object)): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you combine these conditionals? np.any
and np.all
overlap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you for the comment. I combined two if
statement into one.
pandas/core/generic.py
Outdated
) | ||
# GH#53631 | ||
if np.any(obj.dtypes == object) and ( | ||
obj.ndim == 1 or not np.all(obj.dtypes == object) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like part of this is already handled below (if obj.ndim == 2 and np.all(obj.dtypes == object)
).
In instead nearby that check could you add a if obj.ndim == 1 and obj.dtype == object:
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you. To avoid duplicate checks I changed blocks of if
statements as you suggested.
pandas/core/generic.py
Outdated
if np.all(obj.dtypes == object): | ||
raise TypeError( | ||
"Cannot interpolate with all object-dtype columns " | ||
"in the DataFrame. Try setting at least one " | ||
"column to a numeric dtype." | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can now simplify things and remove this check. The reason: we raise
TypeError(
f"{type(self).__name__} cannot interpolate with object dtype."
)
in case np.any(obj.dtypes == object)
below and there is no need to raise Type Error if np.all(obj.dtypes == object)
. Should I remove raising
if np.all(obj.dtypes == object):
raise TypeError(
"Cannot interpolate with all object-dtype columns "
"in the DataFrame. Try setting at least one "
"column to a numeric dtype."
)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice catch! Yeah try that out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But the messaging will need to probably change, this could be a Series
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, I don't understand why is it a Series
? I thought it's a DataFrame
, because we are in the block
if obj.ndim == 2:
I suggest to use the message f"{type(self).__name__} cannot interpolate with object dtype."
, the same one we use in the block below
elif np.any(obj.dtypes == object):
raise TypeError(
f"{type(self).__name__} cannot interpolate with object dtype."
)
I think we can just remove the block with the old message:
if obj.ndim == 2:
if np.all(obj.dtypes == object):
...
and change the old message in tests to the new one "DataFrame cannot interpolate with object dtype"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah sorry I had misunderstood your initial suggestion.
You can also go a step further and replace All of these checks with
if np.any(obj.dtypes == object):
raise TypeError(f"{type(self).__name__} cannot interpolate with object dtype.")
Since Series.dtypes
and DataFrame.dtypes
can be checked at the same time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, my wording wasn't clear.
My idea was to remove the second part of the error message:
"Cannot interpolate with all object-dtype columns in the DataFrame. Try setting at least one column to a numeric dtype."
"Try setting at least one column to a numeric dtype"
won't work from now on because of the enforcing in this PR. Then I thought why not to replace the first part: "Cannot interpolate with all object-dtype columns in the DataFrame."
with a new msg: "... cannot interpolate with object dtype."
If we do this we can remove one check block.
I made changes. Could you please take a look at it?
Thanks @natmokval |
FutureWarning, | ||
stacklevel=find_stack_level(), | ||
) | ||
raise TypeError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest doing this check inside Block.interpolate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's a TODO(3.0) comment related to this inside Block.interpolate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you, I found the comment. In #58083 I moved raising TypeError from NDFrame
to Block
.
I think maybe it's better to change the error message from "NumpyBlock cannot interpolate with object dtype."
to "Can not interpolate with object dtype."
?
…ev#57820) * remove interpolate with object dtype * enforce deprecation interpolate with object dtype, correct tests * fix ruff error * add a note to v3.0.0 * combine two conditions * change blocks of if statements to avoid duplicate checks * replace err msg containing 'Try setting at least one column to a numeric dtype' * simplify if condition
xref #53638
enforced deprecation of
interpolate
with object dtype