Skip to content

Fix: Treat Generic classes as not being is_list_like #49736

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 14 commits into from
Dec 13, 2022

Conversation

Dr-Irv
Copy link
Contributor

@Dr-Irv Dr-Irv commented Nov 16, 2022

@Dr-Irv Dr-Irv mentioned this pull request Nov 16, 2022
@mroeschke mroeschke added Typing type annotations, mypy/pyright type checking Python 3.11 labels Nov 17, 2022
@WillAyd
Copy link
Member

WillAyd commented Nov 17, 2022

Isn't python 3.11 exhibiting the correct behavior?

@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Nov 17, 2022

Isn't python 3.11 exhibiting the correct behavior?

No, see the example in #49649 . It is preventing subclassing DataFrame as a Generic.

I think what happened here is that python 3.11 added an __iter__() method to the class representing the type of the Generic, so we see that type as being "list like". Then internally, the typing module assigns an attribute to the instance, which we then complain about. Prior to 3.11, the __iter__() method wasn't there, so we were OK.

@WillAyd
Copy link
Member

WillAyd commented Nov 17, 2022

Ah OK thanks for the explanation. If you replaced Gen[T] in your test with list[T] what would your expectation be? I think we should add that as a test as well

@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Nov 17, 2022

Ah OK thanks for the explanation. If you replaced Gen[T] in your test with list[T] what would your expectation be? I think we should add that as a test as well

Before seeing that message, I updated the test to more explicitly test the case of subclassing a DataFrame . So I'm not sure if we need to test list[T] .

The issue is that when you instantiate the subclass that is Generic, the typing module creates an attribute on the class and we issue a warning:

C:\Anaconda3\envs\pandas-dev311\Lib\typing.py:1253: UserWarning: Pandas doesn't allow columns to be created via a new attribute name - see https://pandas.pydata.org/pandas-docs/stable/indexing.html#attribute-access
  result.__orig_class__ = self

Here result is a DataFrame, and self is the _GenericAlias .

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@jbrockmendel
Copy link
Member

Perf?

@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Nov 17, 2022

Perf?

Have to assume that the difference between doing isinstance(obj, (str, bytes)) and isinstance(obj, (str, bytes, _GenericAlias)) is minimal. And that our benchmarks won't see the difference.

@jbrockmendel
Copy link
Member

I'm sure it will be small, but this is a function that we've gone out of our way to optimize to the gills

@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Nov 17, 2022

I'm sure it will be small, but this is a function that we've gone out of our way to optimize to the gills

So what's the methodology to do the comparison? In CI? Locally?

And if there is a hit, then what do we do about it?

@jbrockmendel
Copy link
Member

So what's the methodology to do the comparison? In CI? Locally?

usually show some before/after results with %timeit in a few cases chosen using your best judgement

@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Nov 17, 2022

So what's the methodology to do the comparison? In CI? Locally?

usually show some before/after results with %timeit in a few cases chosen using your best judgement

OK, so I did this test
BEFORE

%timeit inference.is_list_like(set([1,2,3]))
572 ns ± 14.4 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

AFTER

%timeit inference.is_list_like(set([1,2,3]))
592 ns ± 12 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

I tested a set because it is the last thing checked in is_list_like, so we know the new test is always executed. This is with python 3.8 on my Windows machine.

Open to other tests to try.

@jbrockmendel
Copy link
Member

thanks

@lithomas1 lithomas1 closed this Nov 20, 2022
@lithomas1 lithomas1 reopened this Nov 20, 2022
@mroeschke mroeschke added this to the 1.5.3 milestone Nov 21, 2022
@mroeschke
Copy link
Member

Looks like the whatsnew note will need to move to 1.5.3

@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Nov 21, 2022

Looks like the whatsnew note will need to move to 1.5.3

That's unfortunate. I was hoping this would get in to clean up some test code in pandas-stubs. Oh, well

@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Nov 21, 2022

Looks like the whatsnew note will need to move to 1.5.3

Actually, who's responsible for creating the new 1.5.3 whatsnew template?

@mroeschke
Copy link
Member

I think the new whatsnew template is done at the end of the release process? cc @datapythonista

@datapythonista
Copy link
Member

Looks like the whatsnew note will need to move to 1.5.3

Actually, who's responsible for creating the new 1.5.3 whatsnew template?

Yes, I'll be doing it shortly, sorry for the delay.

@WillAyd
Copy link
Member

WillAyd commented Dec 10, 2022

Maybe worth updating the branch again but otherwise lgtm to merge

@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Dec 11, 2022

Maybe worth updating the branch again but otherwise lgtm to merge

I moved the whatsnew comment from 1.5.2 to 1.5.3 in a4142ad

@mroeschke
Copy link
Member

Sorry looks like another merge conflict slipped in.

Also the diff shows that doc/source/whatsnew/v1.5.2.rst is still modified (errant empty note)

@Dr-Irv
Copy link
Contributor Author

Dr-Irv commented Dec 12, 2022

Sorry looks like another merge conflict slipped in.

Also the diff shows that doc/source/whatsnew/v1.5.2.rst is still modified (errant empty note)

Should be fixed in cb6140f

Copy link
Member

@mroeschke mroeschke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick fix. I'll merge on green

@mroeschke mroeschke merged commit e37040a into pandas-dev:main Dec 13, 2022
@mroeschke
Copy link
Member

Thanks @Dr-Irv

@lumberbot-app
Copy link

lumberbot-app bot commented Dec 13, 2022

Owee, I'm MrMeeseeks, Look at me.

There seem to be a conflict, please backport manually. Here are approximate instructions:

  1. Checkout backport branch and update it.
git checkout 1.5.x
git pull
  1. Cherry pick the first parent branch of the this PR on top of the older branch:
git cherry-pick -x -m1 e37040a54b89ebb6a1c33dbde700b57900912ff4
  1. You will likely have some merge/cherry-pick conflict here, fix them and commit:
git commit -am 'Backport PR #49736: Fix: Treat Generic classes as not being is_list_like'
  1. Push to a named branch:
git push YOURFORK 1.5.x:auto-backport-of-pr-49736-on-1.5.x
  1. Create a PR against branch 1.5.x, I would have named this PR:

"Backport PR #49736 on branch 1.5.x (Fix: Treat Generic classes as not being is_list_like)"

And apply the correct labels and milestones.

Congratulations — you did some good work! Hopefully your backport PR will be tested by the continuous integration and merged soon!

Remember to remove the Still Needs Manual Backport label once the PR gets merged.

If these instructions are inaccurate, feel free to suggest an improvement.

Dr-Irv added a commit to Dr-Irv/pandas that referenced this pull request Dec 13, 2022
* Fix: Treat Generic classes as not being is_list_like

* tst: update test to test subclass of DataFrame

* add is_list_like test for instance of generic dataframe

* combine isinstance tests

* update whatsnew with clearer message

* Update whatsnew to reference DataFrame as a class

Co-authored-by: Matthew Roeschke <[email protected]>

* fix whatsnew blank line

* fix whatsnew issue quotes

* move whatsnew comment from 1.5.2 to 1.5.3

Co-authored-by: Matthew Roeschke <[email protected]>
(cherry picked from commit e37040a)
mroeschke pushed a commit that referenced this pull request Dec 13, 2022
… being is_list_like) (#50233)

* Fix: Treat Generic classes as not being is_list_like (#49736)

* Fix: Treat Generic classes as not being is_list_like

* tst: update test to test subclass of DataFrame

* add is_list_like test for instance of generic dataframe

* combine isinstance tests

* update whatsnew with clearer message

* Update whatsnew to reference DataFrame as a class

Co-authored-by: Matthew Roeschke <[email protected]>

* fix whatsnew blank line

* fix whatsnew issue quotes

* move whatsnew comment from 1.5.2 to 1.5.3

Co-authored-by: Matthew Roeschke <[email protected]>
(cherry picked from commit e37040a)

* Force checks

* put back import warnings in lib.pyx - got deleted when merged
@Dr-Irv Dr-Irv deleted the issue49649 branch February 13, 2023 20:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Python 3.11 Typing type annotations, mypy/pyright type checking
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: warning from typing library if you subclass a DataFrame using Generic with python 3.11
8 participants