-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
TYP: rename #46428
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TYP: rename #46428
Conversation
Hello @twoertwein! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2022-04-01 12:11:24 UTC |
Marking as draft as #46423 should be merged first. Then this PR can add a few more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one small question. Otherwise, looks good.
Thank you @Dr-Irv for reviewing my PRs :) If you think I can provide feedback for your PRs here or at MS, feel free to ping me. I slowly try to get the errors from pyright's |
pandas/core/frame.py
Outdated
index: Renamer | None = ..., | ||
columns: Renamer | None = ..., | ||
index: Renamer[HashableTa] | None = ..., | ||
columns: Renamer[HashableTb] | None = ..., |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ugly: need different TypeVars so that they are not tied to each other
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add the comment here as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
specifying mapper and index/columns raise a TypeError.. TypeError: Cannot specify both 'mapper' and any of 'index' or 'columns'
As this is by definition a TypeError, we could catch this with a type checker by adding more overloads.
would presumably only need two TypeVars?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As this is by definition a TypeError, we could catch this with a type checker by adding more overloads.
This would double the number of overloads. I wouldn't mind skipping this (for now).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's ugly, but it works. See suggestions.
pandas/core/frame.py
Outdated
index: Renamer | None = ..., | ||
columns: Renamer | None = ..., | ||
index: Renamer[HashableTa] | None = ..., | ||
columns: Renamer[HashableTb] | None = ..., |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add the comment here as well
hmm, this is really the issue here, that Mapping is invariant in its key type, xref python/mypy#12464 (comment) the response was to use Any as the key type as a workaround. While this is probably fine for stubs since a dict/mapping can only be created at runtime with hashable keys, this workaround does not help with the type checking of the function itself, as we would not know the type of the key inside the function. (Using I fear that due to other questions in that thread that we didn't get a resolution on this one. So while the solution here using the TypeVars seems to satisfy the requirements of both users checking their own code and checking the internals of pandas itself, it still feels a bit kludgy. @twoertwein in you discussions with mypy/pyright regarding this did you come across other projects having this issue? Is there a precendent for using TypeVars as a workaround for Mapping being invariant in its key type? |
could we maybe use Any in the overloads (the public API) and Hashable in the signature of function itself (internal code checking)? |
I agree. IMHO, if we want to use a type for the keys of |
since that was "official" response from the mypy side I think we should probably do that for now. This does not satisfy internal code checking requirements, hence the suggestion above #46428 (comment) |
I think I finally agree with
No other projects but two of my "issues" at pyright microsoft/pyright#2344 (comment) (pyright: explicit use case to get covariant behavior) and microsoft/pyright#3272 (no objections by pyright and mypy maintainers for implementing covariant keys with typevars - but also no approval) The only issue I have with from typing import Any, Hashable, Mapping, TypeVar
HashableT = TypeVar("HashableT", bound=Hashable)
def get_mapper() -> Mapping[list[int], int]:
...
def any_rename(mapper: Mapping[Any, Hashable]) -> None:
...
def typevar_rename(mapper: Mapping[HashableT, Hashable]) -> None:
...
# allowed - Mapping does not require the keys to be Hashable (Dict does)
any_rename(get_mapper())
# error (both mypy and pyright)
typevar_rename(get_mapper()) But I fully agree that the TypeVar solution is quite verbose when we have functions where we need multiple TypeVars to express the same type constraint (on the otherhand it will also work with any non-generic type including |
I'll go with
I think we are unfortunately quite far from that point. I would prioritize py.typed over having no |
My takeaways for the future:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I pulled your latest version, and ran the following test:
# flake8: noqa: F841
from typing import List, Union, TYPE_CHECKING
import pandas as pd
def check_series_result(s: pd.Series):
pass
def test_types_rename() -> None:
# Scalar
s1 = pd.Series([1, 2, 3]).rename("A")
check_series_result(s1)
# Hashable Sequence
s2 = pd.Series([1, 2, 3]).rename(("A", "B"))
check_series_result(s2)
# Optional
s3 = pd.Series([1, 2, 3]).rename(None)
check_series_result(s3)
# Functions
def add1(x: int) -> int:
return x + 1
s4 = pd.Series([1, 2, 3]).rename(add1)
check_series_result(s4)
# Dictionary
s5 = pd.Series([1, 2, 3]).rename({1: 10})
check_series_result(s5)
# inplace
s6: None = pd.Series([1, 2, 3]).rename("A", inplace=True)
if TYPE_CHECKING:
s7 = pd.Series([1, 2, 3]).rename({1: [3, 4, 5]}) # type: ignore
mypy is thinking that all the results are Optional[Series]
, not Series
(or None
in the case of inplace
), so there is an issue here.
Thank you for testing! When I run mypy and pyright (both on python 3.10), I do not get a single error (except for unused variables from pyright). |
My mistake. I had a typo when I pulled the PR. Test passes for me now. |
thanks @twoertwein |
tested with mypy and pyright: