-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Ensure isinstance checks use ABC classes where available #27353
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
If they aren't inducing otherwise-unnecessary runtime imports, the non-ABC versions are fine, aren't they? FWIW I benchmarked these a long time ago and |
Yea valid points but I'm not sure what percent of these they would apply to. Particular with pandas.core.frame (the first issue) we are only import MultiIndex for is instancechecks when we actually already have the ABCMultiIndex in there as well, so that at least is a very good candidate to change Noted on performance but we are still in the range of nanoseconds so I think OK unless stuck in a performance critical loop |
would be +1 in making imports simpler / more consistent |
Hi @WillAyd, can you explain bit more about it? I am new here and interested in it. |
Sure - for any of the failures at the bottom of the OP there should be an ABC class defined in pandas.core.dtypes.generic that you could use instead, especially if we only import the objects in the OP for isinstance checks |
@WillAyd I think I can tackle this...there may be a number of additional instances that the script is missing because it's not parsing multi-line statements. Consider Line 4615 in 5d9fd7e
But since the Lines 1737 to 1739 in 5d9fd7e
In this file, |
Thanks for looking. The script is probably secondary to actually making the changes so if you’ve looked and would like to submit a PR can take things from there |
when using the ABCs in isinstance checks the expressions resolves to Any. so every PR citing this issue is reducing the type checking and not allowing mypy to narrow types. should we have ABCs that are not created dynamically at runtime or should we only use the ABCs where necessary to eliminate the need for a cast. the contributing guide would suggest the later in the short term until we have a solution.. "the use of cast is strongly discouraged. Where applicable a refactor of the code to appease static analysis is preferable" |
also could add types to pandas.core.dtypes.generic using cast or add a stub file for pandas.core.dtypes.generic adding a stub file would be inconsistent with our current approach for a standard internal python module, but is straightforward if we don't need to type check pandas.core.dtypes.generic itself. (it's mainly dynamic in nature). |
By the way @jbrockmendel what you describe might have been resolved in Py37: https://bugs.python.org/issue31333 Haven't benchmarked personally just sharing as I came across |
I don't think so. On py 37:
(so in this case even worse than the factor 2 Brock mentioned before) |
The current list of files returned by @WillAyd's script after my PR:
In all of these files the non ABC classes references are also created or a class method is used so I didn't update them. Based on the data above I'm not sure we want to update these. What other action is left for this issue? |
If that's the remaining items then your PR can close this once merged. Thanks for reviewing @Kazz47 |
running cProfile over the test suite and sorting by "tottime",
|
@WillAyd closing this due to performance and typing concerns. feel free to reopen if you want the discussion to continue. |
Here's a quick little script I wrote to check for things like isinstance(obj, MultiIndex)
Running that from project root yields all of the following errors:
So I think these need to be swapped out for their ABC equivalents
The text was updated successfully, but these errors were encountered: