-
-
Notifications
You must be signed in to change notification settings - Fork 141
Support for pyright 1.1.310 #716
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Interval[Timestamp], | ||
Interval[Timedelta], | ||
CategoricalDtype, | ||
bound=str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See point #3 at microsoft/pyright#5178 (comment) , Recommended to use a Union
rather than a constrained type var.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this also affect other TypeVars (could be done later)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use S1 = TypeVar("S1", bound=Scalar)
? You replaced S1
with Scalar
in some places
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
S1 = TypeVar(
"S1",
bound=Scalar
| Interval[int]
| Interval[float]
| Interval[Timestamp]
| Interval[Timedelta]
| Dtype
| Period
| CategoricalDtype
| datetime.time,
)
Seems to work (but Scalar includes a few more types!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this also affect other TypeVars (could be done later)?
Yes, and I created an issue for that #719
name: Hashable | None = ..., | ||
copy: bool = ..., | ||
fastpath: bool = ..., | ||
) -> Self: ... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This overload was previously connected to the next one in terms of the data
argument, but I split it into two to have it return Self
when S1
is known, and to return Series
if not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the two Self overloads be combined?
edit: Should we relax S1
as we explicitly allow object
(which is correct)?
edit2: The following replaces the last three overloads and passes all tests with pyright (but not mypy):
@overload
def __new__(
cls,
data: Series[S1] | dict[int, S1] | dict[_str, S1] | object= ...,
index: Axes | None = ...,
dtype: type[S1] | type | str | None=...,
name: Hashable = ...,
copy: bool = ...,
fastpath: bool = ...,
) -> Self: ...
About mypy errors: since they now have more frequent releases, I wouldn't mind making changes that break mypy-compatibility IF 1) there is a long-standing mypy issue, 2) after informing them that we will break compatibility, and 3) if it helps clean pandas-stubs up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the two Self overloads be combined?
The reason I separated them is that we want to allow people to have an untyped data
and specify dtype = type[S1]
, OR specify a typed data
(where S1
can be inferred) and not specify dtype
. If we combine them, then it allows any object
along with dtype: str
and that is an untyped Series
. That's where mypy
has trouble.
edit: Should we relax
S1
as we explicitly allowobject
(which is correct)?
No. I want to make it that if we know the type and can do something special with it (e.g., Series[Timestamp].__add__(Timedelta)
) we differentiate that. object
is too wide, so that will correspond to Series[Any]
(or in pyright terms, Series[Unknown]
)
edit2: The following replaces the last three overloads and passes all tests with pyright (but not mypy):
@overload def __new__( cls, data: Series[S1] | dict[int, S1] | dict[_str, S1] | object= ..., index: Axes | None = ..., dtype: type[S1] | type | str | None=..., name: Hashable = ..., copy: bool = ..., fastpath: bool = ..., ) -> Self: ...
See above reason why I don't want to combine them.
The mypy
error is reported here: python/mypy#15322
About mypy errors: since they now have more frequent releases, I wouldn't mind making changes that break mypy-compatibility IF 1) there is a long-standing mypy issue, 2) after informing them that we will break compatibility, and 3) if it helps clean pandas-stubs up.
I'm more hesitant to do that. Most of our bug reports from the community are things people find by using mypy
. It is the dominant type checker for now, so we should make sure we handle people's code with it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
edit: Should we relax
S1
as we explicitly allowobject
(which is correct)?No. I want to make it that if we know the type and can do something special with it (e.g.,
Series[Timestamp].__add__(Timedelta)
) we differentiate that.object
is too wide, so that will correspond toSeries[Any]
(or in pyright terms,Series[Unknown]
)
I didn't mean to use data: object
but to simply widen S1 to S1 = TypeVar("S1")
, which allows any "object". Wouldn't this still allow special overloads? In either way, this PR is targeted for pyright compatibility and shouldn't depend on that.
def __add__(self, other: TimedeltaSeries | np.timedelta64) -> TimestampSeries: ... # type: ignore[override] | ||
def __radd__(self, other: TimedeltaSeries | np.timedelta64) -> TimestampSeries: ... # type: ignore[override] | ||
def __add__(self, other: TimedeltaSeries | np.timedelta64 | timedelta) -> TimestampSeries: ... # type: ignore[override] | ||
def __radd__(self, other: TimedeltaSeries | np.timedelta64 | timedelta) -> TimestampSeries: ... # type: ignore[override] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently pyright 1.1.310 had bugs that revealed we were missing some types here
@@ -1604,12 +1605,14 @@ def test_pandera_generic() -> None: | |||
T = TypeVar("T") | |||
|
|||
class MySeries(pd.Series, Generic[T]): | |||
... | |||
def __new__(cls, *args, **kwargs) -> Self: | |||
return object.__new__(cls) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes pyright
not complain.
|
||
def func() -> MySeries[float]: | ||
return MySeries[float]([1, 2, 3]) | ||
|
||
func() | ||
result = func() | ||
assert result.iloc[1] == 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I modified the test to make sure that defining __new__()
didn't change the result.
Thanks @Dr-Irv ! |
See discussion in microsoft/pyright#5178
Recommendation was to return
Self
instead ofSeries[S1]
. But if you do that when we don't know the type, it breaksmypy
. So if we don't know the type of theSeries
, we returnSeries
and that only broke a test related to pandera sublcassing, but only forpyright
, so I modified that test to makepyright
happy.