Skip to content

STYLE use ast.parse instead of regex to fixup inconsistencies in namespace #39690

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Feb 19, 2021

Conversation

MarcoGorelli
Copy link
Member

@MarcoGorelli MarcoGorelli commented Feb 9, 2021

  • tests added / passed
  • Ensure all linting tests pass, see here for how to run them

The regular expression wasn't working too well, this solution should be better. Also, like this, you can pass --replace and have the inconsistencies fixed up for you (though I haven't passed this by default as it may be confusing)

Just running on pandas/tests/frame/ here, can expand to the rest in separate PRs if this is alright

@lithomas1 lithomas1 added the Code Style Code style, linting, code_checks label Feb 9, 2021
Copy link
Member

@dsaxton dsaxton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @MarcoGorelli. The logic of how this works is a bit over my head to be honest, but looks like it's catching a lot that was missed previously 👍

@@ -7,52 +7,97 @@
This is meant to be run as a pre-commit hook - to run it manually, you can do:

pre-commit run inconsistent-namespace-usage --all-files

To automatically fixup a given file, you can pass `--replace`, e.g.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a preference for how to fix the file if both usages occur?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

currently I'm just removing the pd. prefix if both occur

@jbrockmendel
Copy link
Member

@momisbestfriend can you take a look? IIRC you did a lot of the ast-based checks

@MarcoGorelli
Copy link
Member Author

@momisbestfriend can you take a look? IIRC you did a lot of the ast-based checks

@ShaharNaveh

@@ -753,7 +754,7 @@ def test_constructor_extension_scalar_data(self, data, dtype):
assert df["a"].dtype == dtype
assert df["b"].dtype == dtype

arr = pd.array([data] * 2, dtype=dtype)
arr = pd_array([data] * 2, dtype=dtype)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the advantage of using such an alias instead of simply using pd.array ?

Copy link
Member Author

@MarcoGorelli MarcoGorelli Feb 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that in this file there is already import array, so having pd.array as well would fail the check

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I meant from a developer (code writer / reader) perspective ;)
I understand that the check has difficulties with it, but then I would say to either fix the check or ignore "array" in the check if it's not easy to fix (IMO array in tests should always be used as pd.array, so it might worth a special case in the checking code)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, I can add it to EXCLUDE (along with eval and np)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thnks!

@jreback jreback added this to the 1.3 milestone Feb 10, 2021
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm ex @jorisvandenbossche comments

@MarcoGorelli
Copy link
Member Author

lgtm ex @jorisvandenbossche comments

Thanks - I believe these were addressed in d7825aa, but Joris do let me know if you had something different in mind

@jbrockmendel
Copy link
Member

i think i got into the habit of importing array as pd_array bc there are a number of places where we use "array" as a variable name (or in principle could be using the stdlib array module) and name overlap of that form bothers me more than most

@ShaharNaveh
Copy link
Member

@momisbestfriend can you take a look? IIRC you did a lot of the ast-based checks

Yes correct I did a lot of AST checks:)

This script looks amazing, LGTM:)

@MarcoGorelli MarcoGorelli force-pushed the inconsistent-namespace-ast branch from d7825aa to 1f6eba0 Compare February 12, 2021 16:55
@MarcoGorelli
Copy link
Member Author

This script looks amazing

Thanks - I learned loads by studying the pyupgrade source code

@jreback
Copy link
Contributor

jreback commented Feb 15, 2021

@MarcoGorelli can you merge master and ping on green. i think this is fine (and followup PRs to add coverage are good).

@MarcoGorelli
Copy link
Member Author

MarcoGorelli commented Feb 16, 2021

Sure, have merged - @jreback

EDIT I'll try re-writing this slightly so it's just standard library, then the same local virtualenv as is used for lots of other checks can be reused

UPDATE

Now, tokenize-rt is only required if using the --replace flag. This way, the pre-commit hook can re-use the same virtualenv as other local Python hooks

@MarcoGorelli MarcoGorelli requested a review from jreback February 16, 2021 15:19
@pep8speaks
Copy link

pep8speaks commented Feb 16, 2021

Hello @MarcoGorelli! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2021-02-18 12:12:43 UTC

@MarcoGorelli MarcoGorelli marked this pull request as draft February 16, 2021 17:02
@MarcoGorelli MarcoGorelli marked this pull request as ready for review February 16, 2021 18:33
@MarcoGorelli MarcoGorelli marked this pull request as draft February 16, 2021 18:35
@MarcoGorelli MarcoGorelli marked this pull request as ready for review February 16, 2021 20:55
@jreback jreback merged commit de55c3d into pandas-dev:master Feb 19, 2021
@jreback
Copy link
Contributor

jreback commented Feb 19, 2021

thanks @MarcoGorelli pls open an issue to fix the non-checked parts of the codebase.

@MarcoGorelli MarcoGorelli deleted the inconsistent-namespace-ast branch February 19, 2021 07:14
@MarcoGorelli
Copy link
Member Author

thanks @MarcoGorelli pls open an issue to fix the non-checked parts of the codebase.

Thanks - I'm running a little event with PyLadies London on the 6th of March, so I'll get participants to work on this then

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Code Style Code style, linting, code_checks
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants