TYP: rename #46428

twoertwein · 2022-03-19T02:59:17Z

tested with mypy and pyright:

import pandas as pd

def bool_() -> bool:
    ...

df = pd.DataFrame({"a": []})
ser = pd.Series({"a": 1})

# DataFrame/Series
reveal_type(df.rename(columns={"a": "b"}))
reveal_type(df.rename(columns={"a": "b"}, inplace=False))
reveal_type(ser.rename("a"))
reveal_type(ser.rename("a", inplace=False))

# None
reveal_type(df.rename(columns={"a": "b"}, inplace=True))
reveal_type(ser.rename("a", inplace=True))

# both
reveal_type(df.rename(columns={"a": "b"}, inplace=bool_()))
reveal_type(ser.rename("a", inplace=bool_()))

pep8speaks · 2022-03-19T02:59:22Z

Hello @twoertwein! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2022-04-01 12:11:24 UTC

twoertwein · 2022-03-19T03:03:39Z

Marking as draft as #46423 should be merged first. Then this PR can add a few more IgnoreRaise to rename.

Dr-Irv

Just one small question. Otherwise, looks good.

pandas/core/series.py

twoertwein · 2022-03-23T15:28:18Z

Thank you @Dr-Irv for reviewing my PRs :) If you think I can provide feedback for your PRs here or at MS, feel free to ping me.

I slowly try to get the errors from pyright's reportGeneralTypeIssues to manageable amount to enable it at some point (#44855). Many (easier) to fix errors are caused by missing overloads (mypy assumes Any but pyright inspects all possible return values which then creates error).

pandas/core/series.py

pandas/core/frame.py

twoertwein · 2022-03-30T02:51:36Z

pandas/core/frame.py

-        index: Renamer | None = ...,
-        columns: Renamer | None = ...,
+        index: Renamer[HashableTa] | None = ...,
+        columns: Renamer[HashableTb] | None = ...,


ugly: need different TypeVars so that they are not tied to each other

maybe add the comment here as well

specifying mapper and index/columns raise a TypeError.. TypeError: Cannot specify both 'mapper' and any of 'index' or 'columns'

As this is by definition a TypeError, we could catch this with a type checker by adding more overloads.

would presumably only need two TypeVars?

As this is by definition a TypeError, we could catch this with a type checker by adding more overloads.

This would double the number of overloads. I wouldn't mind skipping this (for now).

pandas/core/generic.py

pandas/io/json/_normalize.py

pandas/_typing.py

Dr-Irv

It's ugly, but it works. See suggestions.

pandas/_typing.py

pandas/core/series.py

Dr-Irv · 2022-03-30T12:29:47Z

pandas/core/frame.py

-        index: Renamer | None = ...,
-        columns: Renamer | None = ...,
+        index: Renamer[HashableTa] | None = ...,
+        columns: Renamer[HashableTb] | None = ...,


maybe add the comment here as well

simonjayhawkins · 2022-03-31T15:04:05Z

I think I found a nicer solution that works with the latest pyright+mypy and is sanctions by one of the typing gods:

hmm, this is really the issue here, that Mapping is invariant in its key type, xref python/mypy#12464 (comment)

the response was to use Any as the key type as a workaround. While this is probably fine for stubs since a dict/mapping can only be created at runtime with hashable keys, this workaround does not help with the type checking of the function itself, as we would not know the type of the key inside the function. (Using Any also prevents us turning on some of the strict flags down the line.)

I fear that due to other questions in that thread that we didn't get a resolution on this one.

So while the solution here using the TypeVars seems to satisfy the requirements of both users checking their own code and checking the internals of pandas itself, it still feels a bit kludgy.

@twoertwein in you discussions with mypy/pyright regarding this did you come across other projects having this issue? Is there a precendent for using TypeVars as a workaround for Mapping being invariant in its key type?

simonjayhawkins · 2022-03-31T15:10:00Z

could we maybe use Any in the overloads (the public API) and Hashable in the signature of function itself (internal code checking)?

Dr-Irv · 2022-03-31T15:27:12Z

So while the solution here using the TypeVars seems to satisfy the requirements of both users checking their own code and checking the internals of pandas itself, it still feels a bit kludgy.

I agree. IMHO, if we want to use a type for the keys of Dict or Mapping, it can only be a base type (e.g., str, int) or we use Any. We can't use a Union (e.g., Dict[int|str, str] won't work). I don't think we can use Hashable, because essentially Hashable is a Union (although declared as a Protocol).

simonjayhawkins · 2022-03-31T15:44:32Z

or we use Any

since that was "official" response from the mypy side I think we should probably do that for now.

This does not satisfy internal code checking requirements, hence the suggestion above #46428 (comment)

twoertwein · 2022-03-31T19:07:17Z

I think I finally agree with Mapping[Any, Hashable] in case of rename as pandas will only ever call it with Hasables and it is guaranteed to get a Hashable in return. (If we had code that inverts keys and values, then that wouldn't be okay with me.)

@twoertwein in you discussions with mypy/pyright regarding this did you come across other projects having this issue? Is there a precendent for using TypeVars as a workaround for Mapping being invariant in its key type?

No other projects but two of my "issues" at pyright microsoft/pyright#2344 (comment) (pyright: explicit use case to get covariant behavior) and microsoft/pyright#3272 (no objections by pyright and mypy maintainers for implementing covariant keys with typevars - but also no approval)

The only issue I have with Mapping[Any, Hashable] is that the keys of Mapping are not required to be Hashable (and neither are the arguments to a Callable):

from typing import Any, Hashable, Mapping, TypeVar

HashableT = TypeVar("HashableT", bound=Hashable)


def get_mapper() -> Mapping[list[int], int]:
    ...


def any_rename(mapper: Mapping[Any, Hashable]) -> None:
    ...


def typevar_rename(mapper: Mapping[HashableT, Hashable]) -> None:
    ...


# allowed - Mapping does not require the keys to be Hashable (Dict does)
any_rename(get_mapper())

# error (both mypy and pyright)
typevar_rename(get_mapper())

But I fully agree that the TypeVar solution is quite verbose when we have functions where we need multiple TypeVars to express the same type constraint (on the otherhand it will also work with any non-generic type including int | str).

twoertwein · 2022-03-31T19:38:43Z

So while in the past we tried to be as precise as possible with the type annotations, i'm fine with less precise types in these situations (for now!)

I'll go with Any

Using Any also prevents us turning on some of the strict flags down the line.

I think we are unfortunately quite far from that point. I would prioritize py.typed over having no Anys (typeshed itself also has a few Anys).

twoertwein · 2022-03-31T20:25:21Z

My takeaways for the future:

Avoid TypeVar magic when it is getting too verbose
When the first argument of Mapping/Dict/List/Callable is Hashable and the type is used as an input argument, replacing Hashable with Any seems like a good solution
For types stricter than Hashable, TypeVar magic is a solution (maybe even the only solution) but it might get verbose

Dr-Irv

I pulled your latest version, and ran the following test:

# flake8: noqa: F841

from typing import List, Union, TYPE_CHECKING


import pandas as pd


def check_series_result(s: pd.Series):
    pass

def test_types_rename() -> None:
    # Scalar
    s1 = pd.Series([1, 2, 3]).rename("A")
    check_series_result(s1)
    # Hashable Sequence
    s2 = pd.Series([1, 2, 3]).rename(("A", "B"))
    check_series_result(s2)

    # Optional
    s3 = pd.Series([1, 2, 3]).rename(None)
    check_series_result(s3)

    # Functions
    def add1(x: int) -> int:
        return x + 1

    s4 = pd.Series([1, 2, 3]).rename(add1)
    check_series_result(s4)

    # Dictionary
    s5 = pd.Series([1, 2, 3]).rename({1: 10})
    check_series_result(s5)
    # inplace
    s6: None = pd.Series([1, 2, 3]).rename("A", inplace=True)

    if TYPE_CHECKING:
        s7 = pd.Series([1, 2, 3]).rename({1: [3, 4, 5]})  # type: ignore

mypy is thinking that all the results are Optional[Series], not Series (or None in the case of inplace), so there is an issue here.

twoertwein · 2022-03-31T23:17:38Z

I pulled your latest version, and ran the following test

Thank you for testing! When I run mypy and pyright (both on python 3.10), I do not get a single error (except for unused variables from pyright).

Dr-Irv · 2022-04-01T01:07:57Z

I pulled your latest version, and ran the following test

Thank you for testing! When I run mypy and pyright (both on python 3.10), I do not get a single error (except for unused variables from pyright).

My mistake. I had a typo when I pulled the PR. Test passes for me now.

pandas/core/series.py

jreback · 2022-04-03T03:20:45Z

thanks @twoertwein

twoertwein added the Typing type annotations, mypy/pyright type checking label Mar 19, 2022

twoertwein marked this pull request as draft March 19, 2022 03:02

twoertwein marked this pull request as ready for review March 22, 2022 00:31

twoertwein requested a review from Dr-Irv March 22, 2022 21:02

Dr-Irv reviewed Mar 22, 2022

View reviewed changes

pandas/core/series.py Outdated Show resolved Hide resolved

Dr-Irv approved these changes Mar 23, 2022

View reviewed changes

jreback added this to the 1.5 milestone Mar 25, 2022

Dr-Irv reviewed Mar 26, 2022

View reviewed changes

pandas/core/series.py Show resolved Hide resolved

Dr-Irv reviewed Mar 26, 2022

View reviewed changes

pandas/core/frame.py Show resolved Hide resolved

twoertwein mentioned this pull request Mar 27, 2022

TYP: Many typing constructs are invariant #46535

Closed

twoertwein commented Mar 30, 2022

View reviewed changes

pandas/core/generic.py Outdated Show resolved Hide resolved

twoertwein commented Mar 30, 2022

View reviewed changes

pandas/io/json/_normalize.py Outdated Show resolved Hide resolved

twoertwein commented Mar 30, 2022

View reviewed changes

pandas/_typing.py Show resolved Hide resolved

Dr-Irv requested changes Mar 30, 2022

View reviewed changes

twoertwein added 3 commits March 31, 2022 15:50

TYP: rename (TypeVar approach)

3220658

use Any

17a705b

flake8

4dda358

Dr-Irv requested changes Mar 31, 2022

View reviewed changes

Dr-Irv approved these changes Apr 1, 2022

View reviewed changes

update ignore comment

3314b91

simonjayhawkins reviewed Apr 1, 2022

View reviewed changes

pandas/core/series.py Show resolved Hide resolved

remove doc-string of NDFrame._rename

f8682f2

jreback merged commit 79fb2de into pandas-dev:main Apr 3, 2022

twoertwein deleted the rename branch May 26, 2022 01:59

yehoshuadimarsky pushed a commit to yehoshuadimarsky/pandas that referenced this pull request Jul 13, 2022

TYP: rename (pandas-dev#46428)

eb25a11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TYP: rename #46428

TYP: rename #46428

twoertwein commented Mar 19, 2022

pep8speaks commented Mar 19, 2022 •

edited

Loading

twoertwein commented Mar 19, 2022

Dr-Irv left a comment

twoertwein commented Mar 23, 2022

twoertwein Mar 30, 2022

Dr-Irv Mar 30, 2022

simonjayhawkins Mar 31, 2022

twoertwein Mar 31, 2022

Dr-Irv left a comment

Dr-Irv Mar 30, 2022

simonjayhawkins commented Mar 31, 2022

simonjayhawkins commented Mar 31, 2022 •

edited

Loading

Dr-Irv commented Mar 31, 2022

simonjayhawkins commented Mar 31, 2022

twoertwein commented Mar 31, 2022 •

edited

Loading

twoertwein commented Mar 31, 2022

twoertwein commented Mar 31, 2022

Dr-Irv left a comment

twoertwein commented Mar 31, 2022

Dr-Irv commented Apr 1, 2022

jreback commented Apr 3, 2022

TYP: rename #46428

TYP: rename #46428

Conversation

twoertwein commented Mar 19, 2022

pep8speaks commented Mar 19, 2022 • edited Loading

Comment last updated at 2022-04-01 12:11:24 UTC

twoertwein commented Mar 19, 2022

Dr-Irv left a comment

Choose a reason for hiding this comment

twoertwein commented Mar 23, 2022

twoertwein Mar 30, 2022

Choose a reason for hiding this comment

Dr-Irv Mar 30, 2022

Choose a reason for hiding this comment

simonjayhawkins Mar 31, 2022

Choose a reason for hiding this comment

twoertwein Mar 31, 2022

Choose a reason for hiding this comment

Dr-Irv left a comment

Choose a reason for hiding this comment

Dr-Irv Mar 30, 2022

Choose a reason for hiding this comment

simonjayhawkins commented Mar 31, 2022

simonjayhawkins commented Mar 31, 2022 • edited Loading

Dr-Irv commented Mar 31, 2022

simonjayhawkins commented Mar 31, 2022

twoertwein commented Mar 31, 2022 • edited Loading

twoertwein commented Mar 31, 2022

twoertwein commented Mar 31, 2022

Dr-Irv left a comment

Choose a reason for hiding this comment

twoertwein commented Mar 31, 2022

Dr-Irv commented Apr 1, 2022

jreback commented Apr 3, 2022

pep8speaks commented Mar 19, 2022 •

edited

Loading

simonjayhawkins commented Mar 31, 2022 •

edited

Loading

twoertwein commented Mar 31, 2022 •

edited

Loading