-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
TYP: fix reportInvalidTypeVarUse from pyright #42367
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Thanks @twoertwein for the PR. I have merged just merged #41955, so need to change A bit of history here... originally we just had the so some uses of I think mypy does pick up some cases where |
I should have replaced FrameOrSeries = TypeVar("FrameOrSeries", bound="NDFrame") If FrameOrSeries = TypeVar("FrameOrSeries", DataFrame, Series) |
originally, that was what was used. but that created issues with core.generic (and some of the groupby IIRC)
probably only in needed in core.generic and maybe groupby. elsewhere |
There is another subtle difference which maybe significant in some situations. |
Thanks for all the information! I will prefer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @twoertwein generally lgtm.
I think this PR is ready from my perspective (happy to remove the pyright config in pyproject.toml). Let me know what you think @simonjayhawkins |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @twoertwein comment about comments for ignores otherwise lgtm.
pandas/core/groupby/grouper.py
Outdated
@@ -297,7 +303,7 @@ def ax(self) -> Index: | |||
raise ValueError("_set_grouper must be called before ax is accessed") | |||
return index | |||
|
|||
def _get_grouper(self, obj: FrameOrSeries, validate: bool = True): | |||
def _get_grouper(self, obj: NDFrame, validate: bool = True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect that this Typevar may be needed to maintain the return type, but the return is not typed, so I guess ok to use NDFrame for now to silence pyright.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right! I think the return type should be tuple[Any, ops.BaseGrouper, FrameOrSeries]
I'm not sure what self.binner
is supposed to be.
Mypy thinks that all three variables should be None
. So I would need to add an ignore comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I "partially" typed _get_grouper
def _get_grouper(
self, obj: FrameOrSeries, validate: bool = True
) -> tuple[Any, ops.BaseGrouper, FrameOrSeries]:
The Any
is not great, but it is definitely better than before.
@twoertwein can you avoid squashing and force pushing, it makes reviewing updates more difficult. |
This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this. |
Rebased (but this time not squashed) |
@twoertwein whenever you have a second mind merging master |
@alimcmaster1 rebased. Let me know if I should remove some parts that might be controversial (pyright settings in pyproject.toml?). |
@twoertwein can you setup this? |
I will look into that. Should that be part of this PR or a followup (it might take me some time to play around with the CI). Where in the CI should pyright be integrated (I assume in "CI / Checks")? Should pyright also be part of the pre-commit? There doesn't seem to be a pre-commit interface from pyright, except this third-party wrapper: https://github.com/necaris/pre-commit-pyright. Other points to consider:
edit: pyright can apparently be easily integrated in pre-commit: https://github.com/microsoft/pyright/blob/main/docs/ci-integration.md |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be great to enable pyright on the CI. It takes only ~90sec to run pyright on the entire codebase with the following configuration in pyproject.toml (ignores most type errors for now):
@twoertwein can you setup this?
I will look into that. Should that be part of this PR or a followup (it might take me some time to play around with the CI).
Where in the CI should pyright be integrated (I assume in "CI / Checks")? Should pyright also be part of the pre-commit? There doesn't seem to be a pre-commit interface from pyright, except this third-party wrapper: https://github.com/necaris/pre-commit-pyright.
Other points to consider:
- Pyright depends on node.js
- Pyright fixes bugs at a rapid speed: it has ~1 release per week (probably need to pin the version)
- Mypy and pyright might disagree (I don't think this is a concern right now, since most type checking rules need to be disabled to pass pyright).
absolutely separate PR. in fact you might want to get the job in and can mark it failures_allowed for now, then PRs to fix up issues to get it to pass (and a todo list if needed).
- Pyright depends on node.js
prob need to add setup instructions in the docs then on how to run it
- Pyright fixes bugs at a rapid speed: it has ~1 release per week (probably need to pin the version)
sounds fine to pin
@twoertwein happy to merge this when ready |
I partially typed Apart for double-checking that, it should be ready from my side. |
Removed the pyright configuration (will be added in #43672). |
Rebased (includes #43672) and enabled checks for reportInvalidTypeVarUse and reportSelfClsParameterName. |
@@ -1337,7 +1337,7 @@ def _make_unique_kwarg_list( | |||
|
|||
|
|||
def relabel_result( | |||
result: FrameOrSeries, | |||
result: DataFrame | Series, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are we not able to use the alias, this is the entire point of it (i mean if we want to switch exclusively to DataFrame | Series
i guess its ok, but we have an alias on purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FrameOrSeries = TypeVar("FrameOrSeries", bound="NDFrame")
A TypeVar
needs to be bound which isn't the case here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be helpful to rename FrameOrSeries
to NDFrameT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ahh ok, then lets do that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in this PR or a followup?
I don't mind adding more (I'm not the one reviewing it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
follow-up ok. we have the naming for historical reasons and IIRC other reviewers have strong views on the naming, so IMO best not to mix in anything that could be controversial.
def shift( | ||
self: IntervalArrayT, periods: int = 1, fill_value: object = None | ||
) -> IntervalArray: | ||
def shift(self, periods: int = 1, fill_value: object = None) -> IntervalArray: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are these not valid?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here:
IntervalArrayT = TypeVar("IntervalArrayT", bound="IntervalArray")
if the return value was of type IntervalArrayT
, it would be valid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @twoertwein very nice cleanup!
we may want to try to replace more NDFrame usages outside of core (core.generic/core.groupby) where possible now that the invalid use of the inappropriately named FrameOrSeries typvar is caught by pyright.
This fixes the following warning from pyright:
TypeVar "..." appears only once in generic function signature (reportInvalidTypeVarUse)
.It would be great to enable pyright on the CI. It takes only ~90sec to run pyright on the entire codebase with the following configuration in pyproject.toml (ignores most type errors for now):
The following errors would need to be taken care of before then:
pandas/pandas/core/frame.py:4575:9 - error: Method "_reindex_axes" cannot override final method defined in class "NDFrame"
pandas/pandas/core/frame.py:10736:9 - error: Method "mask" cannot override final method defined in class "NDFrame"
pandas/pandas/core/series.py:5449:9 - error: Method "mask" cannot override final method defined in class "NDFrame"
pandas/pandas/core/computation/expr.py:268:26 - error: TypeVar bound type cannot be generic
pandas/pandas/core/groupby/generic.py:676:9 - error: Method "describe" cannot override final method defined in class "GroupBy"
pandas/pandas/core/groupby/generic.py:856:9 - error: Method "pct_change" cannot override final method defined in class "GroupBy"