Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
TYP: try out TypeGuard #51309
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TYP: try out TypeGuard #51309
Changes from all commits
76a998c
e12b7a7
1b3ef00
a1d1517
024ed70
0e4d37a
6be7228
b4ee975
e683d19
64e398e
2940b01
35746ac
f8248e5
1d5f3e4
3e0c4a0
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kind of dislike having composite types inside the type guard (because it probably gives some uglier code in other locations). Maybe only use
TypeGuard
when it gives a single type?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i get where youre coming from, but i also have an urge for "maximum accuracy" here. im going to keep it how it is for now but will be OK with changing if consensus develops
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would keep the Union (but it would be nicer without it). I think not having the Unions might cause some incorrect unreachable code warnings from mypy/pyright.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To some extent, this change is revealing an issue throughout the pandas code.
Sometimes, when we call
is_integer(value)
oris_bool(value)
, the intent is to check whether the value is an integer or numpy integer, or bool or numpy boolean. In other cases, we actually want to check that the value is NOT a numpy type (although I'm not 100% sure of that).If it is the case that we accept a numpy integer/bool whenever we have typed an argument as
int
orbool
, respectively, then the types used within the pandas functions should change to reflect that.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if we want to check that it is a python integer, we'd just do
isinstance(x, int)
. My issue was more that by having a union type here, we'd have to have this union type in a lot of places, which doesn't feel like very clean code. My suggestion was to not typeis_integer
or other with union return types.But ok, it was just a preference, I can also accept it, as overall this PR is a nice improvement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or for every place that such a change is necessary, it makes you analyze the code and determine whether you really want to change the test to
isinstance(x, int)
or modify other parts of the code knowing that anp.integer
is valid.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd import from standard lib if on python >= 3.10, else from
typing_extensions
.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to take into account that
typing_extensions
may not be installed. All in all I'd probably do this something like this:Also add
typing_extensions
torequirements-dev.txt
, I think it's possible to add it liketyping_extensions; python_version < '3.10'
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does that work in an environment.yml file? requirements-dev.txt is auto-generated from the environment.yml.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
checking for py310 requires a circular import which isnt great. im leaning towards abandoning this PR, would be happy if you'd like to get it across the finish line
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I didn't know it is autogenerated and I not knowledgeable with environment.yml to know if it`s possible there. I could take a look. I can also try take over, no problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TypeGuard = bool
? In my understanding aTypeGuard
is like a bool that ensure that another has some type…There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably not be outside of the
if TYPE_CHECKING
section?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could remove
# type: ignore
by changing signature:There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's not good to change type signatures to keep the type checkers happy, mostly because it could easily spread to be needed everywhere, which would be a PITA.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But that's part of the point of type checking! The fact that you had to do a
# type: ignore
is a signal that the code is inconsistent in terms of the expected arguments. In this particular case, the function is called only once, so changing the type in the function signature won't matter. Not sure of the other cases that I noted elsewhere in the PR.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could also add this file to the long list here https://github.com/pandas-dev/pandas/blob/main/pyright_reportGeneralTypeIssues.json#L17
But I wouldn't mind start using line-by-line ignores. I think we started with file-based ignores as we have in most cases just too many errors.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Things like this will show up many place where we use
is_integer
(after this PR), so IMO it file based exclusions will not work because every time we useis_integer
in new files, these typing errors may show up. So using file based ignores will quickly make using pyright less meaningful.So I think we will have to either:
TypeGuard
)TypeGuard
(minimizes issues when using functions that implementTypeGuard)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can get rid of the error by changing the sig of the function:
In this case, the type checker was telling you that
np.bool_
is a valid type for thevalue
argument.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change the sig here too:
Note - if you really don't want to allow
np.integer
, then change the test inside to not useis_integer()
.