Skip to content

DEPR: deprecate Index.holds_integer #50243

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Jan 15, 2023

Conversation

topper-123
Copy link
Contributor

"""
Whether the type is an integer type.
"""
return lib.infer_dtype(value) in ["integer", "mixed-integer"]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@topper-123 relying on this at all for internal purposes is what i dislike. whether its a method vs function is totally irrelevant

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, can you explan what's the issue with this approach? This method was only used in the plotting module, so a very little used function? Should we replace it there with normal dtype checking, i.e. just drop the "mixed-integer" checks?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is "values-dependent behavior" which in general we are trying to move away from. it makes it much harder to reason about code. one of the reasons this is little-used is i have been trying to remove usages of it.

i expect most of the relevant cases could just be is_integer_dtype(self.dtype). there would likely be some straggler object-dtype-of-integer cases that would need some deprecation cycle

@topper-123
Copy link
Contributor Author

topper-123 commented Dec 14, 2022

Ok, I understand. I'll look into it tomorrow. I've never worked on this part of the code base, hopefully it's not too complex.

elif is_object_dtype(self):
return self.inferred_type not in ["integer", "mixed-integer"]
else:
return True
Copy link
Contributor Author

@topper-123 topper-123 Dec 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to just use return self.inferred_type not in ["integer", "mixed-integer"] here instead of the multiliner?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like all our tests pass if we change this whole thing to return self.dtype.kind in ["i", "u"]. I think the thing to do is drop this for now and deprecate it in 2.x and change it to this simpler check in 3.0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow how could deprecate this, as this is an internal attribute, and I don't think this changes behavior (though I'm not 100 % sure there isn't an edge case).

Are you saying only use return self.inferred_type not in ["integer", "mixed-integer"] here and drop the if/elif/else?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

though I'm not 100 % sure there isn't an edge case

Under my proposal, pd.Index([1, 2], dtype=object).holds_integer() would change in the future.

The only places where we use holds_integer are in _should_fallback_to_positional (which I expect to get rid of in 3.0) and plotting code. It would require some actual effort to determine if the suggested change could affect users via the plotting code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concretely in this PR, I git difficulty understanding what you want. Can you say what code you want changed, e.g. should the proposed deprecation of holds_integer be dropped? IMO it does seems like a good idea to deprecate the method...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in this PR i would suggest

def _holds_integer(self) -> bool:
     return self.inferred_type not in ["integer", "mixed-integer"] 

def holds_integer(self) -> bool:
    warngs.warn("holds_integer is deprecated [...]", FutureWarning)
    return self._holds_integer()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, nop problem, I've just updated the PR.

@topper-123
Copy link
Contributor Author

@jbrockmendel , any comment?

@@ -939,15 +939,15 @@ def __call__(self, *args, **kwargs):
f"{kind} requires either y column or 'subplots=True'"
)
if y is not None:
if is_integer(y) and not data.columns.holds_integer():
if is_integer(y) and data.columns._should_fallback_to_positional:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i dont think these are exactly the same. _should_fallback_to_positional is overriden by Index subclasses

Copy link
Contributor Author

@topper-123 topper-123 Dec 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, my instinct is that the old code easily give wrong result by giving an integer to an index that has _should_fallback_to_positional set to False and it should instead have been crystal clear if x/y are meant for .loc/.iloc instead of this in thiscode section. But this is difficult to prove and I'm not very familiar with the plotting code, so I've changed to use _holds_integer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i agree the current code is sketchy. hopefully we'll figure out what the original intent was and can change it to something more like is_integer_dtype(data.columns.dtype)

@topper-123
Copy link
Contributor Author

The ARM failure looks unrelated.

@topper-123 topper-123 force-pushed the deprecate_Index.holds_integer branch from 7260c7b to 107f8f4 Compare December 23, 2022 00:20
@mroeschke mroeschke added the Deprecate Functionality to remove in pandas label Dec 27, 2022
@topper-123 topper-123 force-pushed the deprecate_Index.holds_integer branch from 107f8f4 to 7ed0731 Compare December 28, 2022 00:45
@topper-123
Copy link
Contributor Author

I've changed this to use Index._holds_integer.

@topper-123
Copy link
Contributor Author

Ping.

Copy link
Member

@jbrockmendel jbrockmendel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Holding off on merging until the existing deprecations are all enforced for 2.0.

@topper-123
Copy link
Contributor Author

👍 .

@topper-123 topper-123 added this to the 2.0 milestone Jan 3, 2023
@topper-123 topper-123 added the Index Related to the Index class or subclasses label Jan 3, 2023
@topper-123 topper-123 force-pushed the deprecate_Index.holds_integer branch from 6aeefe5 to ca3c9f3 Compare January 14, 2023 11:43
@topper-123
Copy link
Contributor Author

I've updated this PR after it was decided that it is ok now to merge deprecation PRs.

The code checks errors are unrelated to this PR.

@topper-123
Copy link
Contributor Author

I’ve rerun the failed tests and everything is green now.

@phofl phofl merged commit d3f0e9a into pandas-dev:main Jan 15, 2023
@phofl
Copy link
Member

phofl commented Jan 15, 2023

thx @topper-123

@topper-123 topper-123 deleted the deprecate_Index.holds_integer branch January 15, 2023 15:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Deprecate Functionality to remove in pandas Index Related to the Index class or subclasses
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants