CLN: Condense PR style checklist into one script #30773

gfyoung · 2020-01-07T07:22:50Z

A script is easier to execute and manage as we manage our style-checking tools.

Started with flake8, black, and isort, as those are the main ones for Python-related changes (for reference, we didn't even have isort in the checklist beforehand).

If we're happy with the structure, we can always add more OR modify checks if we want going forward while keeping it easy for folks to check their PR's.

datapythonista · 2020-01-07T10:35:43Z

I like the idea, but shouldn't we reuse ci/code_checks.sh? The new script feels like a subset of it. If we don't want all the checks there, we can probably create a new category to run only what we want (but I think running all the checks is probably ok).

gfyoung · 2020-01-07T10:42:43Z

Not quite a subset. As the echo messages indicate, the way we're running them in the script allows the users to fix mistakes simultaneously checks for black and isort.

code_checks.sh, however, is purely for correctness (i.e. you wouldn't get the auto-correct) and contains a ton more checks (e.g. doctests, cpplint) that IMO are probably not necessary on average.

I'm not necessarily against combining in the future, but given the differences in what this script does vs. code_checks, I think that's a little more acrobatics than I would want to do for this PR.

datapythonista · 2020-01-07T10:48:00Z

I see, I misunderstood that. Makes sense then. But may be the name *_checks is a bit misleading if the script is actually performing a the changes to the code. Do you think it makes sense to rename it to something like fix_code_style or something similar?

gfyoung · 2020-01-07T10:55:12Z

Do you think it makes sense to rename it to something like fix_code_style or something similar?

I get your point. It's a hybrid though, as flake8 does not actually fix your errors, only lists them out.

datapythonista · 2020-01-07T11:01:54Z

Not a big deal, but I think it's better to expect a script to change code, and that it possibly doesn't (even black and isort won't in many cases). Than have a script to check your code, that changes it unexpectedly. I don't even like that we need to use black and isort to change automatically our code. Having a script that does that in a somehow (unintentionally) sneaky way would be really annoying to me.

gfyoung · 2020-01-07T11:04:38Z

Not a big deal, but I think it's better to expect a script to change code, and that it possibly doesn't (even black and isort won't in many cases). Than have a script to check your code, that changes it unexpectedly.

True (though the echo messages mitigate the surprise I might argue).

How about quick_style_check_and_fix.sh then (to reflect that we're doing both)?

datapythonista · 2020-01-07T11:16:08Z

though the echo messages mitigate the surprise

Imagine a case where you didn't commit the changes, you run the checks to see if there is anything you missed, and then you realize that the checks made black actually destroyed all your styles. The surprise would still be there. ;)

Anyway, quick_style_check_and_fix.sh sounds good, and without the quick too to make it a bit shorter.

In a follow up I may check if it's easy to display different messages depending on whether changes need to be committed or not, so users of the script don't need to read and think too much.

TomAugspurger · 2020-01-07T12:04:01Z

@gfyoung are you using pre-commit? This script duplicates what it does.

gfyoung · 2020-01-07T20:32:07Z

@TomAugspurger : I don't actually use it because the mypy check takes some time to run on my machine. You do raise a good point that there is overlap due to the other checks.

Unless we decide to make it a required thing on the PR checklist to use pre-commit, I think the script can allow us to be more comprehensive with the checklist without having to enumerate all of the different checks we run (even pre-commit is missing at least one: cpplint)

TomAugspurger · 2020-01-07T20:38:05Z

I'd vote for telling people to use pre-commit, and including whatever checks we want in there.

…

On Tue, Jan 7, 2020 at 2:32 PM gfyoung ***@***.***> wrote: @TomAugspurger <https://github.com/TomAugspurger> : I don't actually because the mypy check takes some time to run on my machine. You do raise a good point that there is overlap due to the other checks. Unless we decide to make it a required thing on the PR checklist to use pre-commit, I think the script can allow us to be more comprehensive with the checklist without having to enumerate all of the different checks we run (even pre-commit is missing at least one: cpplint) — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#30773?email_source=notifications&email_token=AAKAOIRIFCDE5O2G73OTOJTQ4TRERA5CNFSM4KDTQRW2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEIKFQEY#issuecomment-571758611>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAKAOIRCXDKBJEOTLOK5FR3Q4TRERANCNFSM4KDTQRWQ> .

gfyoung · 2020-01-07T20:53:38Z

cc @pandas-dev/pandas-core : Thoughts?

The question is whether we want people to go in on using pre-commit and thereby remove the redundancy we have with the PR checklist altogether. My only concern with pre-commit (as is) is that it can slow down development (see #30773 (comment)), especially if the changes are small.

Alternatively, we condense the lint checkboxes into a single script as I have done here. The downside is that we still have some redundancy in maintaining linting checks.

.github/PULL_REQUEST_TEMPLATE.md

TomAugspurger · 2020-01-07T21:18:04Z

My only concern with pre-commit (as is) is that it can slow down development (see #30773 (comment)), especially if the changes are small.

If we're concerned about the speed of the pre-commit hooks, we can discuss which items we want included by default. I don't have a strong opinion. Perhaps @xhochy can share his thoughts on he thought including mypy was a good idea.

xhochy · 2020-01-07T21:40:50Z

I would also strongly argue that people should just pre-commit instead of an additional script as it takes care of a lot of pitfalls one can run into. Normally I have all static checks in pre-commit add its an easy-to-setup way to ensure that people don't push code that will then fail on public CI but still consume resources there.

I first also thought that mypy may be too slow but with a pre-filled cache, it is running really fast for me. Could be that it is only fast due my hardware and the cache is not fast enough for others, then it is probably better to remove mypy again until we get a better incremental mode for to work. People not using pre-commit because of this would be even worse.

gfyoung · 2020-01-07T21:42:03Z

Could be that it is only fast due my hardware and the cache is not fast enough for others, then it is probably better to remove mypy again until we get a better incremental mode for to work.

That would be a reasonable compromise.

TomAugspurger · 2020-01-07T21:54:56Z

Anecdotally, I'm getting used to mypy running after a day or two, but it slightly annoyed me at first 😄. The initial check after changing branches can feel slow, but subsequent ones are much faster. So I'm happy either way.

gfyoung · 2020-01-07T22:13:39Z

So I'm happy either way.

Sounds good. I'll open a new PR to move in this new direction then.

xhochy · 2020-01-08T06:29:35Z

@gfyoung @TomAugspurger Can you give me a rough number how much time the mypy check with a pre-filled cache took for you?

jorisvandenbossche · 2020-01-08T08:16:56Z

I have to say I have been quite annoyed the last days with the slowness of the commit hook now mypy was added. Is the caching normally working out of the box, or do I need to do something for that? But I have been switching a lot between branches, so that might a reason to be more annoyed.

The mypy step in the pre-commit hook on a fresh branch takes around 20s for me.

jorisvandenbossche · 2020-01-08T08:21:33Z

Also a second time on the same branch, it still takes a long time (around 20s), so even with caching, it is still annoyingly slow for committing IMO

xhochy · 2020-01-08T09:15:19Z

Removed it over at #30811

For me the timings are 18s without and 0.6s with a cache, so it seemed appropriate to me. I wasn't aware that the impact could be this big on different setups.

jorisvandenbossche · 2020-01-08T09:20:40Z

Or my cache is not working properly for some reason? Although I see a .mypy_cache directory in the pandas repo, with files changed at the moment I did the pre-commit tests I mentioned above.

xhochy · 2020-01-08T09:24:43Z

Or my cache is not working properly for some reason? Although I see a .mypy_cache directory in the pandas repo, with files changed at the moment I did the pre-commit tests I mentioned above.

In my case only @plugins_snapshot.json gets changed, all other files stay the same.

gfyoung · 2020-01-08T09:43:57Z

Can you give me a rough number how much time the mypy check with a pre-filled cache took for you?

Similar to what @jorisvandenbossche gave.

TomAugspurger · 2020-01-08T12:13:02Z

Mine were closer to Uwe's.

…

On Wed, Jan 8, 2020 at 3:43 AM gfyoung ***@***.***> wrote: Can you give me a rough number how much time the mypy check with a pre-filled cache took for you? Similar to what @jorisvandenbossche <https://github.com/jorisvandenbossche> gave <#30773 (comment)>. — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#30773?email_source=notifications&email_token=AAKAOITEZSLRATO57W7U2W3Q4WN57A5CNFSM4KDTQRW2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEILZEGI#issuecomment-571970073>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AAKAOIX6IYQD44J7VS3SXDDQ4WN57ANCNFSM4KDTQRWQ> .

jbrockmendel · 2020-01-08T18:26:42Z

my mypy runtimes are closer to Joris's

gfyoung · 2020-01-08T22:34:52Z

Closing this so we can focus on #30814

Previously, we stated it as merely optional xref: #30773 #30814

gfyoung added the Clean label Jan 7, 2020

gfyoung added this to the 1.0 milestone Jan 7, 2020

gfyoung force-pushed the easy-pr-check-script branch from 5b88c4e to d3efe65 Compare January 7, 2020 07:23

jschendel reviewed Jan 7, 2020

View reviewed changes

.github/PULL_REQUEST_TEMPLATE.md Show resolved Hide resolved

CLN: Condense PR style checklist into one script

a483da4

gfyoung force-pushed the easy-pr-check-script branch from d3efe65 to a483da4 Compare January 8, 2020 06:21

jorisvandenbossche removed this from the 1.0 milestone Jan 8, 2020

gfyoung added this to the No action milestone Jan 8, 2020

gfyoung closed this Jan 8, 2020

gfyoung deleted the easy-pr-check-script branch January 8, 2020 22:35

gfyoung mentioned this pull request Jan 9, 2020

DOC: Encourage use of pre-commit in the docs #30864

Merged

gfyoung added a commit that referenced this pull request Jan 10, 2020

DOC: Encourage use of pre-commit in the docs (#30864)

d1b9598

Previously, we stated it as merely optional xref: #30773 #30814

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CLN: Condense PR style checklist into one script #30773

CLN: Condense PR style checklist into one script #30773

gfyoung commented Jan 7, 2020 •

edited

Loading

datapythonista commented Jan 7, 2020

gfyoung commented Jan 7, 2020 •

edited

Loading

datapythonista commented Jan 7, 2020

gfyoung commented Jan 7, 2020

datapythonista commented Jan 7, 2020

gfyoung commented Jan 7, 2020 •

edited

Loading

datapythonista commented Jan 7, 2020

TomAugspurger commented Jan 7, 2020

gfyoung commented Jan 7, 2020 •

edited

Loading

TomAugspurger commented Jan 7, 2020 via email

gfyoung commented Jan 7, 2020 •

edited

Loading

TomAugspurger commented Jan 7, 2020

xhochy commented Jan 7, 2020

gfyoung commented Jan 7, 2020

TomAugspurger commented Jan 7, 2020

gfyoung commented Jan 7, 2020

xhochy commented Jan 8, 2020 •

edited

Loading

jorisvandenbossche commented Jan 8, 2020 •

edited

Loading

jorisvandenbossche commented Jan 8, 2020

xhochy commented Jan 8, 2020 •

edited

Loading

jorisvandenbossche commented Jan 8, 2020

xhochy commented Jan 8, 2020

gfyoung commented Jan 8, 2020

TomAugspurger commented Jan 8, 2020 via email

jbrockmendel commented Jan 8, 2020

gfyoung commented Jan 8, 2020

CLN: Condense PR style checklist into one script #30773

CLN: Condense PR style checklist into one script #30773

Conversation

gfyoung commented Jan 7, 2020 • edited Loading

datapythonista commented Jan 7, 2020

gfyoung commented Jan 7, 2020 • edited Loading

datapythonista commented Jan 7, 2020

gfyoung commented Jan 7, 2020

datapythonista commented Jan 7, 2020

gfyoung commented Jan 7, 2020 • edited Loading

datapythonista commented Jan 7, 2020

TomAugspurger commented Jan 7, 2020

gfyoung commented Jan 7, 2020 • edited Loading

TomAugspurger commented Jan 7, 2020 via email

gfyoung commented Jan 7, 2020 • edited Loading

TomAugspurger commented Jan 7, 2020

xhochy commented Jan 7, 2020

gfyoung commented Jan 7, 2020

TomAugspurger commented Jan 7, 2020

gfyoung commented Jan 7, 2020

xhochy commented Jan 8, 2020 • edited Loading

jorisvandenbossche commented Jan 8, 2020 • edited Loading

jorisvandenbossche commented Jan 8, 2020

xhochy commented Jan 8, 2020 • edited Loading

jorisvandenbossche commented Jan 8, 2020

xhochy commented Jan 8, 2020

gfyoung commented Jan 8, 2020

TomAugspurger commented Jan 8, 2020 via email

jbrockmendel commented Jan 8, 2020

gfyoung commented Jan 8, 2020

gfyoung commented Jan 7, 2020 •

edited

Loading

gfyoung commented Jan 7, 2020 •

edited

Loading

gfyoung commented Jan 7, 2020 •

edited

Loading

gfyoung commented Jan 7, 2020 •

edited

Loading

gfyoung commented Jan 7, 2020 •

edited

Loading

xhochy commented Jan 8, 2020 •

edited

Loading

jorisvandenbossche commented Jan 8, 2020 •

edited

Loading

xhochy commented Jan 8, 2020 •

edited

Loading