Skip to content

Unpin mypy #33274

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 13 commits into from
Closed

Unpin mypy #33274

wants to merge 13 commits into from

Conversation

WillAyd
Copy link
Member

@WillAyd WillAyd commented Apr 3, 2020

Right now we have mypy 0.73 pinned in CI. However, newer versions will yield a lot of warnings like this:

pandas/core/groupby/base.py:32: error: "GroupByMixin" has no attribute "obj"
pandas/core/groupby/base.py:36: error: "GroupByMixin" has no attribute "_attributes"
pandas/core/groupby/base.py:40: error: "GroupByMixin" has no attribute "_groupby"
pandas/core/groupby/base.py:42: error: "GroupByMixin" has no attribute "_groupby"
pandas/core/groupby/base.py:44: error: Too many arguments for "GroupByMixin"
pandas/core/groupby/base.py:44: error: Unexpected keyword argument "groupby" for "GroupByMixin"
pandas/core/groupby/base.py:44: error: Unexpected keyword argument "parent" for "GroupByMixin"
pandas/core/groupby/base.py:45: error: "GroupByMixin" has no attribute "_reset_cache"

Most mixins we have will be problematic, if not all. Fortunately newer versions of mypy offer support for Protocols as a "self type", which were officially introduced in Python 3.8

This draft is an attempt at clarifying how that would work and seeing if it makes sense for the group. Depending on feedback, we may want to consider some kind of compat for this model to get it to work across Py36 and Py37 so our mypy doesn't lag too far behind

More info on the mypy release notes:

https://mypy-lang.blogspot.com/2019/11/mypy-0.html

@simonjayhawkins

@pep8speaks
Copy link

pep8speaks commented Apr 3, 2020

Hello @WillAyd! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 1095:5: E704 multiple statements on one line (def)

Line 1060:5: E704 multiple statements on one line (def)
Line 1063:5: E704 multiple statements on one line (def)
Line 1066:5: E704 multiple statements on one line (def)

Line 219:5: E704 multiple statements on one line (def)
Line 273:5: E704 multiple statements on one line (def)
Line 276:5: E704 multiple statements on one line (def)
Line 279:5: E704 multiple statements on one line (def)
Line 281:5: E704 multiple statements on one line (def)
Line 283:5: E704 multiple statements on one line (def)
Line 285:5: E704 multiple statements on one line (def)
Line 287:5: E704 multiple statements on one line (def)
Line 289:5: E704 multiple statements on one line (def)
Line 291:5: E704 multiple statements on one line (def)
Line 470:5: E704 multiple statements on one line (def)
Line 472:5: E704 multiple statements on one line (def)
Line 475:5: E704 multiple statements on one line (def)
Line 478:5: E704 multiple statements on one line (def)
Line 481:5: E704 multiple statements on one line (def)
Line 484:5: E704 multiple statements on one line (def)
Line 487:5: E704 multiple statements on one line (def)
Line 489:5: E704 multiple statements on one line (def)
Line 491:5: E704 multiple statements on one line (def)
Line 493:5: E704 multiple statements on one line (def)
Line 495:5: E704 multiple statements on one line (def)
Line 497:5: E704 multiple statements on one line (def)
Line 499:5: E704 multiple statements on one line (def)
Line 501:5: E704 multiple statements on one line (def)
Line 503:5: E704 multiple statements on one line (def)
Line 505:5: E704 multiple statements on one line (def)
Line 507:5: E704 multiple statements on one line (def)
Line 509:5: E704 multiple statements on one line (def)
Line 511:5: E704 multiple statements on one line (def)
Line 513:5: E704 multiple statements on one line (def)
Line 515:5: E704 multiple statements on one line (def)
Line 517:5: E704 multiple statements on one line (def)
Line 517:48: W291 trailing whitespace
Line 519:5: E704 multiple statements on one line (def)
Line 521:5: E704 multiple statements on one line (def)
Line 523:5: E704 multiple statements on one line (def)
Line 588:89: E501 line too long (94 > 88 characters)
Line 811:89: E501 line too long (90 > 88 characters)
Line 1030:89: E501 line too long (100 > 88 characters)

Line 22:5: E704 multiple statements on one line (def)
Line 22:89: E501 line too long (105 > 88 characters)
Line 25:5: E704 multiple statements on one line (def)
Line 28:5: E704 multiple statements on one line (def)
Line 31:5: E704 multiple statements on one line (def)
Line 33:5: E704 multiple statements on one line (def)
Line 34:1: W293 blank line contains whitespace

Comment last updated at 2020-04-09 22:26:53 UTC

@WillAyd WillAyd added the Typing type annotations, mypy/pyright type checking label Apr 3, 2020
@jbrockmendel
Copy link
Member

could define in pandas._typing:

if PY38:
    from typing import Protocol
else:
    class Protocol:
         pass

I think I tried this with typing.final with mixed success

@WillAyd
Copy link
Member Author

WillAyd commented Apr 6, 2020

closing for now; hopefully food for thought in the future

@WillAyd WillAyd closed this Apr 6, 2020
@WillAyd WillAyd reopened this Apr 7, 2020
@WillAyd WillAyd changed the title Protocol for GroupByMixin Unpin mypy Apr 9, 2020
@WillAyd
Copy link
Member Author

WillAyd commented Apr 9, 2020

I've repurposed this PR to try and unpin mypy 0.73 because the newer versions have a lot of great features that will work much better with the mix-ins that we have.

Unfortunately going to be a huge PR (in current state doens't work, actually causes more errors than before). Defining a protocol and checking for adherence on the DatetimeLike arena is where the difficulty lies

I'll post back when I have something reviewable

@simonjayhawkins
Copy link
Member

@WillAyd I think bumping mypy could(should) be kept seperate to introducing Protocols for Mixins. see #29493 (comment)

@simonjayhawkins
Copy link
Member

bumping to mypy 0.740 is probably the most difficult bump, see #28339 (comment) and #28339 (comment)

in #28339, I originally started to address some of the 0.740 issues (before reverting back to 0.730), so there is a small overlap with some of the changes here.

I will reopen #28339 temporarily and resync with my local branch. (for reference/comparison)

@WillAyd
Copy link
Member Author

WillAyd commented Apr 10, 2020

@WillAyd I think bumping mypy could(should) be kept seperate to introducing Protocols for Mixins. see #29493 (comment)

Yea I was hoping this at first, but when looking at it I wasn't really sure of a way to solve all of the mix-in patterns we have and the mypy errors they throw without the Protocol.

Maybe overlooked in what you linked but did you have alternate strategies in mind? Only other one I could think of is just ignoring errors in the datetimelike file where they occur, as that is probably more than half of the upgrade issues

@simonjayhawkins
Copy link
Member

not looked at the specific errors for a while, and now we have more functions typed, it probably won't no longer be as simple as updating the check_untyped_defs list.

not used protocols, but my understanding of PEP 544 and looking at typeshed. my initial reaction is that the protocols added here are not in the spirit of protocols, but that's a subjective opinion.

the issue of Mixins and the use of Protocols in self types, came after PEP544 and the introduction of automatically assigning the self type to self in 0.740, so I appreciate that this issue needs to be addressed.

to help keep the code clean, we should probably put protocols in pandas._typing.

There are other uses of protocol such as user facing apis and using @runtime_checkable to potentially replace so is_* functions. Since there is also NoReturn, Literal and TypedDict, it may be worth a seperate PR for vendoring typing extensions. May also be better not to put in pandas/ so that our lint checks do not apply.

and bumping mypy is probably long overdue. so anything to keep the PR manageable welcome.

@simonjayhawkins
Copy link
Member

just in case you want to update the list I use

mypy pandas --check-untyped-defs|grep -oP "^.?(?=:)"|sort|uniq|sort|grep -oP "^.?(?=.)"|sed 's/\/./g'|sed 's/^/[mypy-/'|sed 's/$/]\ncheck_untyped_defs=False\n/'

not elegant but works with the exception of clipboard which needs the __init__ removed. Need to manually remove the check_untyped_defs overrides in setup.py first.

@WillAyd
Copy link
Member Author

WillAyd commented Apr 10, 2020

not used protocols, but my understanding of PEP 544 and looking at typeshed. my initial reaction is that the protocols added here are not in the spirit of protocols, but that's a subjective opinion.

Can you clarify that more? FWIW I think ideally you would define the protocol first and keep it tight so subclasses can adhere to it. But we are tackling this from the other way where things are already implemented so have to define the Protocol that fits the requirements in place. Makes the protocols more verbose than you would want for sure.

Not sure if that's the issue you are describing or if you have something else in mind

@simonjayhawkins
Copy link
Member

That sums it up, but I agree that the Mixin issue is a different use case.

@simonjayhawkins simonjayhawkins mentioned this pull request Apr 13, 2020
@jbrockmendel
Copy link
Member

is this actionable?

@simonjayhawkins
Copy link
Member

is this actionable?

we could deal with the Mixin issue now using a pattern described in python/mypy#5837 (comment) and not actually call it a Protocol at this stage.

SomeRealClass would be the same as Protocol here, and the use of TYPE_CHECKING would make the base class only available when checking and leave the runtime behaviour unchanged. (i.e what Protocol is doing( without @runtime_checkable))

so adopting Protocol does not prevent unpinning mypy. These issues are orthogonal.

@WillAyd
Copy link
Member Author

WillAyd commented Apr 22, 2020

Closing again to clear queue. This is going to be tricky; probably a few ways to break up we can continue discussing here or in dedicated issues

@WillAyd WillAyd closed this Apr 22, 2020
@simonjayhawkins
Copy link
Member

we could deal with the Mixin issue now using a pattern described in python/mypy#5837 (comment) and not actually call it a Protocol at this stage.

FYI this pattern fails with Generic types, whereas Protocol could be generic.

This may be irrelevant to the decision at this stage if we are reluctant to use the Generic abc. The same discussion will probably arise for Protocol since it adds a base class at runtime as well and in some ways is not so different from Generic.

the protocols in this PR are not generic, so the pattern could be used.

@WillAyd WillAyd deleted the groupby-mixin-protocol branch April 12, 2023 20:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Typing type annotations, mypy/pyright type checking
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants