-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
DEPR: deprecate Index.holds_integer #50243
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DEPR: deprecate Index.holds_integer #50243
Conversation
pandas/core/dtypes/common.py
Outdated
""" | ||
Whether the type is an integer type. | ||
""" | ||
return lib.infer_dtype(value) in ["integer", "mixed-integer"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@topper-123 relying on this at all for internal purposes is what i dislike. whether its a method vs function is totally irrelevant
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, can you explan what's the issue with this approach? This method was only used in the plotting module, so a very little used function? Should we replace it there with normal dtype checking, i.e. just drop the "mixed-integer" checks?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is "values-dependent behavior" which in general we are trying to move away from. it makes it much harder to reason about code. one of the reasons this is little-used is i have been trying to remove usages of it.
i expect most of the relevant cases could just be is_integer_dtype(self.dtype)
. there would likely be some straggler object-dtype-of-integer cases that would need some deprecation cycle
Ok, I understand. I'll look into it tomorrow. I've never worked on this part of the code base, hopefully it's not too complex. |
pandas/core/indexes/base.py
Outdated
elif is_object_dtype(self): | ||
return self.inferred_type not in ["integer", "mixed-integer"] | ||
else: | ||
return True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we want to just use return self.inferred_type not in ["integer", "mixed-integer"]
here instead of the multiliner?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks like all our tests pass if we change this whole thing to return self.dtype.kind in ["i", "u"]
. I think the thing to do is drop this for now and deprecate it in 2.x and change it to this simpler check in 3.0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I follow how could deprecate this, as this is an internal attribute, and I don't think this changes behavior (though I'm not 100 % sure there isn't an edge case).
Are you saying only use return self.inferred_type not in ["integer", "mixed-integer"]
here and drop the if/elif/else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
though I'm not 100 % sure there isn't an edge case
Under my proposal, pd.Index([1, 2], dtype=object).holds_integer()
would change in the future.
The only places where we use holds_integer are in _should_fallback_to_positional (which I expect to get rid of in 3.0) and plotting code. It would require some actual effort to determine if the suggested change could affect users via the plotting code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concretely in this PR, I git difficulty understanding what you want. Can you say what code you want changed, e.g. should the proposed deprecation of holds_integer
be dropped? IMO it does seems like a good idea to deprecate the method...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in this PR i would suggest
def _holds_integer(self) -> bool:
return self.inferred_type not in ["integer", "mixed-integer"]
def holds_integer(self) -> bool:
warngs.warn("holds_integer is deprecated [...]", FutureWarning)
return self._holds_integer()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, nop problem, I've just updated the PR.
@jbrockmendel , any comment? |
pandas/plotting/_core.py
Outdated
@@ -939,15 +939,15 @@ def __call__(self, *args, **kwargs): | |||
f"{kind} requires either y column or 'subplots=True'" | |||
) | |||
if y is not None: | |||
if is_integer(y) and not data.columns.holds_integer(): | |||
if is_integer(y) and data.columns._should_fallback_to_positional: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i dont think these are exactly the same. _should_fallback_to_positional is overriden by Index subclasses
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, my instinct is that the old code easily give wrong result by giving an integer to an index that has _should_fallback_to_positional
set to False and it should instead have been crystal clear if x/y are meant for .loc/.iloc instead of this in thiscode section. But this is difficult to prove and I'm not very familiar with the plotting code, so I've changed to use _holds_integer
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i agree the current code is sketchy. hopefully we'll figure out what the original intent was and can change it to something more like is_integer_dtype(data.columns.dtype)
The ARM failure looks unrelated. |
7260c7b
to
107f8f4
Compare
107f8f4
to
7ed0731
Compare
I've changed this to use |
Ping. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Holding off on merging until the existing deprecations are all enforced for 2.0.
👍 . |
6aeefe5
to
ca3c9f3
Compare
I've updated this PR after it was decided that it is ok now to merge deprecation PRs. The code checks errors are unrelated to this PR. |
I’ve rerun the failed tests and everything is green now. |
thx @topper-123 |
See #50042 (comment)