-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
STYLE use ast.parse instead of regex to fixup inconsistencies in namespace #39690
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
STYLE use ast.parse instead of regex to fixup inconsistencies in namespace #39690
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @MarcoGorelli. The logic of how this works is a bit over my head to be honest, but looks like it's catching a lot that was missed previously 👍
@@ -7,52 +7,97 @@ | |||
This is meant to be run as a pre-commit hook - to run it manually, you can do: | |||
|
|||
pre-commit run inconsistent-namespace-usage --all-files | |||
|
|||
To automatically fixup a given file, you can pass `--replace`, e.g. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a preference for how to fix the file if both usages occur?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
currently I'm just removing the pd.
prefix if both occur
@momisbestfriend can you take a look? IIRC you did a lot of the ast-based checks |
|
@@ -753,7 +754,7 @@ def test_constructor_extension_scalar_data(self, data, dtype): | |||
assert df["a"].dtype == dtype | |||
assert df["b"].dtype == dtype | |||
|
|||
arr = pd.array([data] * 2, dtype=dtype) | |||
arr = pd_array([data] * 2, dtype=dtype) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the advantage of using such an alias instead of simply using pd.array
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that in this file there is already import array
, so having pd.array
as well would fail the check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but I meant from a developer (code writer / reader) perspective ;)
I understand that the check has difficulties with it, but then I would say to either fix the check or ignore "array" in the check if it's not easy to fix (IMO array in tests should always be used as pd.array
, so it might worth a special case in the checking code)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, I can add it to EXCLUDE
(along with eval
and np
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thnks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm ex @jorisvandenbossche comments
Thanks - I believe these were addressed in d7825aa, but Joris do let me know if you had something different in mind |
i think i got into the habit of importing array as pd_array bc there are a number of places where we use "array" as a variable name (or in principle could be using the stdlib array module) and name overlap of that form bothers me more than most |
Yes correct I did a lot of AST checks:) This script looks amazing, LGTM:) |
d7825aa
to
1f6eba0
Compare
Thanks - I learned loads by studying the pyupgrade source code |
@MarcoGorelli can you merge master and ping on green. i think this is fine (and followup PRs to add coverage are good). |
Sure, have merged - @jreback EDIT I'll try re-writing this slightly so it's just standard library, then the same local virtualenv as is used for lots of other checks can be reused UPDATENow, |
Hello @MarcoGorelli! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2021-02-18 12:12:43 UTC |
…pandas into inconsistent-namespace-ast
thanks @MarcoGorelli pls open an issue to fix the non-checked parts of the codebase. |
Thanks - I'm running a little event with PyLadies London on the 6th of March, so I'll get participants to work on this then |
The regular expression wasn't working too well, this solution should be better. Also, like this, you can pass
--replace
and have the inconsistencies fixed up for you (though I haven't passed this by default as it may be confusing)Just running on
pandas/tests/frame/
here, can expand to the rest in separate PRs if this is alright