Skip to content

STYLE: Fixing and refactoring linting #22863

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 19 commits into from
Oct 9, 2018
Merged

STYLE: Fixing and refactoring linting #22863

merged 19 commits into from
Oct 9, 2018

Conversation

datapythonista
Copy link
Member

@datapythonista datapythonista commented Sep 27, 2018

  • Wrong path pandas/src corrected (.px.in was not being tested)
  • Made more compact and readable
  • Unified the flake8 errors that are being ignored
  • Moving doctests and check_imports to lint.sh
  • Moving linting from py2.7 to py3.6

@datapythonista datapythonista added the CI Continuous Integration label Sep 27, 2018
@datapythonista
Copy link
Member Author

There is something weird in travis. When I run ./ci/lint.sh locally, I've got errors for all the E722 do not use bare except in python files. But those seem to be ignored when the script runs in Travis. When I tested the linting in azure, I've also got the errors.

@jreback @TomAugspurger any idea why is this happening? I couldn't see anything on the travis configuration or in setup.cfg that explains this.

Also, which is the desired behavior, do we want to allow base except?

inside ./ci/lint.sh Linting .py code pandas\compat\pickle_compat.py:36:13: E722 do not use bare except' pandas\compat\pickle_compat.py:47:13: E722 do not use bare except' pandas\compat\pickle_compat.py:185:1: E722 do not use bare except' pandas\compat\pickle_compat.py:213:5: E722 do not use bare except' pandas\core\frame.py:3263:13: E722 do not use bare except' pandas\core\frame.py:7706:9: E722 do not use bare except' pandas\core\indexing.py:2149:9: E722 do not use bare except' pandas\core\indexing.py:2708:13: E722 do not use bare except' pandas\core\indexing.py:2714:9: E722 do not use bare except' pandas\core\nanops.py:506:13: E722 do not use bare except' pandas\core\nanops.py:818:13: E722 do not use bare except' pandas\core\ops.py:1548:17: E722 do not use bare except' pandas\core\window.py:2507:5: E722 do not use bare except' pandas\core\computation\pytables.py:414:9: E722 do not use bare except' pandas\core\dtypes\common.py:470:5: E722 do not use bare except' pandas\core\dtypes\dtypes.py:361:9: E722 do not use bare except' pandas\core\dtypes\dtypes.py:522:13: E722 do not use bare except' pandas\core\indexes\frozen.py:142:9: E722 do not use bare except' pandas\core\indexes\multi.py:1005:17: E722 do not use bare except' pandas\core\indexes\multi.py:1010:17: E722 do not use bare except' pandas\core\indexes\multi.py:1689:9: E722 do not use bare except' pandas\core\indexes\multi.py:2318:17: E722 do not use bare except' pandas\core\indexes\multi.py:2821:17: E722 do not use bare except' pandas\core\internals\blocks.py:669:9: E722 do not use bare except' pandas\core\internals\blocks.py:1145:9: E722 do not use bare except' pandas\core\internals\blocks.py:1160:9: E722 do not use bare except' pandas\core\internals\blocks.py:2441:13: E722 do not use bare except' pandas\core\internals\blocks.py:3175:9: E722 do not use bare except' pandas\core\sparse\array.py:309:9: E722 do not use bare except' pandas\core\tools\datetimes.py:244:17: E722 do not use bare except' pandas\core\tools\datetimes.py:334:9: E722 do not use bare except' pandas\core\tools\datetimes.py:731:5: E722 do not use bare except' pandas\core\tools\datetimes.py:738:5: E722 do not use bare except' pandas\core\tools\datetimes.py:745:5: E722 do not use bare except' pandas\io\clipboards.py:45:9: E722 do not use bare except' pandas\io\packers.py:706:13: E722 do not use bare except' pandas\io\parsers.py:1811:9: E722 do not use bare except' pandas\io\parsers.py:3037:13: E722 do not use bare except' pandas\io\parsers.py:3266:9: E722 do not use bare except' pandas\io\parsers.py:3287:9: E722 do not use bare except' pandas\io\parsers.py:3291:9: E722 do not use bare except' pandas\io\pickle.py:172:13: E722 do not use bare except' pandas\io\pickle.py:177:5: E722 do not use bare except' pandas\io\pytables.py:261:9: E722 do not use bare except' pandas\io\pytables.py:398:5: E722 do not use bare except' pandas\io\pytables.py:402:9: E722 do not use bare except' pandas\io\pytables.py:520:9: E722 do not use bare except' pandas\io\pytables.py:678:17: E722 do not use bare except' pandas\io\pytables.py:1164:9: E722 do not use bare except' pandas\io\pytables.py:1273:9: E722 do not use bare except' pandas\io\pytables.py:1310:17: E722 do not use bare except' pandas\io\pytables.py:1321:13: E722 do not use bare except' pandas\io\pytables.py:1357:17: E722 do not use bare except' pandas\io\pytables.py:1362:9: E722 do not use bare except' pandas\io\pytables.py:1627:9: E722 do not use bare except' pandas\io\pytables.py:1659:9: E722 do not use bare except' pandas\io\pytables.py:1872:9: E722 do not use bare except' pandas\io\pytables.py:2235:17: E722 do not use bare except' pandas\io\pytables.py:2328:9: E722 do not use bare except' pandas\io\pytables.py:2772:17: E722 do not use bare except' pandas\io\pytables.py:2846:9: E722 do not use bare except' pandas\io\pytables.py:2964:9: E722 do not use bare except' pandas\io\pytables.py:3498:13: E722 do not use bare except' pandas\io\pytables.py:3617:17: E722 do not use bare except' pandas\io\pytables.py:3645:17: E722 do not use bare except' pandas\io\pytables.py:4463:5: E722 do not use bare except' pandas\io\pytables.py:4785:13: E722 do not use bare except' pandas\io\sql.py:385:5: E722 do not use bare except' pandas\io\sql.py:850:13: E722 do not use bare except' pandas\io\sql.py:1363:9: E722 do not use bare except' pandas\io\stata.py:1255:9: E722 do not use bare except' pandas\io\stata.py:1260:9: E722 do not use bare except' pandas\io\formats\console.py:103:5: E722 do not use bare except' pandas\io\formats\console.py:121:5: E722 do not use bare except' pandas\io\formats\console.py:140:5: E722 do not use bare except' pandas\io\formats\console.py:152:5: E722 do not use bare except' pandas\io\formats\terminal.py:81:5: E722 do not use bare except' pandas\io\formats\terminal.py:111:5: E722 do not use bare except' pandas\io\formats\terminal.py:123:9: E722 do not use bare except' pandas\io\formats\terminal.py:132:9: E722 do not use bare except' pandas\io\formats\terminal.py:138:9: E722 do not use bare except' pandas\io\sas\sasreader.py:49:9: E722 do not use bare except' pandas\io\sas\sas_xport.py:249:13: E722 do not use bare except' pandas\tests\test_multilevel.py:1378:9: E722 do not use bare except' pandas\tests\test_nanops.py:144:9: E722 do not use bare except' pandas\tests\test_nanops.py:149:9: E722 do not use bare except' pandas\tests\test_nanops.py:170:21: E722 do not use bare except' pandas\tests\test_nanops.py:174:21: E722 do not use bare except' pandas\tests\test_panel.py:338:13: E722 do not use bare except' pandas\tests\test_panel.py:344:13: E722 do not use bare except' pandas\tests\test_strings.py:2663:13: E722 do not use bare except' pandas\tests\indexing\common.py:154:13: E722 do not use bare except' pandas\tests\indexing\common.py:217:17: E722 do not use bare except' pandas\tests\io\test_pytables.py:54:9: E722 do not use bare except' pandas\tests\io\test_pytables.py:62:5: E722 do not use bare except' pandas\tests\io\test_pytables.py:120:5: E722 do not use bare except' pandas\tests\io\test_pytables.py:4624:21: E722 do not use bare except' pandas\tests\io\test_sql.py:1793:9: E722 do not use bare except' pandas\tests\io\test_sql.py:2378:9: E722 do not use bare except' pandas\tests\io\test_sql.py:2405:9: E722 do not use bare except' pandas\tests\io\formats\test_format.py:73:5: E722 do not use bare except' pandas\tests\io\formats\test_format.py:455:13: E722 do not use bare except' pandas\tseries\holiday.py:297:5: E722 do not use bare except' pandas\tseries\holiday.py:429:9: E722 do not use bare except' pandas\tseries\holiday.py:438:9: E722 do not use bare except' pandas\util\_print_versions.py:24:9: E722 do not use bare except' pandas\util\_print_versions.py:53:5: E722 do not use bare except' pandas\util\_print_versions.py:111:9: E722 do not use bare except' pandas\util\_validators.py:62:9: E722 do not use bare except' Linting .py code DONE Linting .pyx code Linting .pyx code DONE Linting .pxd and .pxi.in Linting .pxd and .pxi.in DONE Linting setup.py Linting setup.py DONE Linting scripts Linting scripts DONE Linting asv benchmarks Linting asv benchmarks DONE Linting doc scripts doc/source/conf.py:568:9: E722 do not use bare except' doc/source/conf.py:573:5: E722 do not use bare except' doc/source/conf.py:580:5: E722 do not use bare except' Linting doc scripts DONE Linting .c and .h Linting .c and .h DONE Check for non-standard imports Check for non-standard imports DONE Check for pytest warns Check for pytest warns DONE Check for invalid testing Check for invalid testing DONE Check for invalid EA testing Check for invalid EA testing DONE Check for deprecated messages without sphinx directive Check for deprecated messages without sphinx directive DONE Check for old-style classes Check for old-style classes DONE Check for backticks incorrectly rendering because of missing spaces Check for backticks incorrectly rendering because of missing spaces DONE Check for incorrect sphinx directives Check for incorrect sphinx directives DONE

@jreback
Copy link
Contributor

jreback commented Sep 28, 2018

we fix the pytest version currently
there is an issue about this 3.4) i think

fixing these is separate and out of scope for this issue

@datapythonista
Copy link
Member Author

This is the issue: #18419, thanks for pointing out.

Agree that's for a different issue.

@datapythonista
Copy link
Member Author

The flake8 errors that we were ignoring was a bit inconsistent. I changed it, so we always ignore the same (the ones in setup.cfg), and I removed the ones that weren't required. I explicitly add E722 (bare except) to the list of ignored ones. I guess eventually we can start validating it, and some of the others.

This PR should be ready to be merged from my side.

@datapythonista datapythonista added the Code Style Code style, linting, code_checks label Sep 28, 2018
@datapythonista datapythonista changed the title Refactoring lint.sh STYLE: Fixing and refactoring linting Sep 28, 2018
@pep8speaks
Copy link

Hello @datapythonista! Thanks for updating the PR.

@codecov
Copy link

codecov bot commented Sep 29, 2018

Codecov Report

Merging #22863 into master will decrease coverage by <.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #22863      +/-   ##
==========================================
- Coverage   92.19%   92.18%   -0.01%     
==========================================
  Files         169      169              
  Lines       50827    50830       +3     
==========================================
  Hits        46860    46860              
- Misses       3967     3970       +3
Flag Coverage Δ
#multiple 90.6% <ø> (-0.01%) ⬇️
#single 42.37% <ø> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/plotting/_core.py 83.35% <0%> (-0.2%) ⬇️
pandas/plotting/_compat.py 91.3% <0%> (+0.39%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d115900...adcda47. Read the comment docs.

@codecov
Copy link

codecov bot commented Sep 29, 2018

Codecov Report

Merging #22863 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #22863   +/-   ##
=======================================
  Coverage   92.19%   92.19%           
=======================================
  Files         169      169           
  Lines       50904    50904           
=======================================
  Hits        46933    46933           
  Misses       3971     3971
Flag Coverage Δ
#multiple 90.61% <ø> (ø) ⬆️
#single 42.31% <ø> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/indexing.py 93.87% <0%> (-0.01%) ⬇️
pandas/io/packers.py 88.04% <0%> (ø) ⬆️
pandas/core/frame.py 97.11% <0%> (ø) ⬆️
pandas/core/indexes/range.py 95.73% <0%> (ø) ⬆️
pandas/io/pytables.py 92.44% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a4482db...8738e80. Read the comment docs.

@datapythonista
Copy link
Member Author

@jorisvandenbossche I moved the doctests to the lint.sh script in this case. Hopefully soon we'll have all them passing, and we should be able to run all them with a single command, and not file by file. Also, some of the code in lint.sh is the same as in doctests.sh, so it made sense to me.

Can you let me know if this looks ok.

@pandas-dev/pandas-core after this is merged, my idea is to move the linting to azure, as the tests in #22854 IMO show a much better interface to identify the errors than the travis logs: https://dev.azure.com/pandas-dev/pandas/_build/results?buildId=367&view=logs

And I'm still researching how to do it, but we should be able to show the errors without entering the logs (where now says Bash exited with code '1'., and also link to the line in the code where the linting problem happened).

Let me know if there is any reason I'm missing to not make the change.

- E731, # do not assign a lambda expression, use a def
- W503 # line break before binary operator
- E741, # ambiguous variable name 'l'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey, this seems like a good change to me! Quick question I recently actually enforced E741 in pep8speaks config, #22795. Hence this error will be flagged on any new PRs which is the behaviour we want? I was going to follow up remove any final instances of this in codebase. See #22122

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for now is important to have consistency between the CI and pep8speaks, and among all the different files. I don't have a strong opinion about E741, or the errors discussed in #22122, in general I think the less exceptions we have, the better.

@datapythonista datapythonista mentioned this pull request Sep 30, 2018
2 tasks
ci/lint.sh Outdated
echo "NOT Linting"
fi

### LINTING ###
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so I would rather have these as 3 scripts, then a top-level script to call each one. That way we can call these individually (e.g. in the Makefile)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean one script for linting (flake8+cpplint), one for the checks (grep) and one for the doctests, right? I think that makes sense, but considering we may want to move this to azure, I see different options:

  • Having 3 scripts as you say
  • Having one, but with a parameter (e.g. ./ci/lint.sh [lint|checks|doctests]). I think this makes things a bit simpler and more compact (as the 3 scripts will be very similar)
  • Leave it as it is, and move in a separate PR the content of this file to azure-pipelines.yml

You'll know better than me how often these checks are run locally or out of the CI, but as now flake8 is mainly flake8 . and the doctests will soon be flake 8 . --doctests I think it can make sense to just have them in azure. May be a script just for the grep checks if you think those are being run locally?

@@ -8,6 +8,8 @@ dependencies:
- dask
- fastparquet
- feather-format
- flake8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we have a reference in the docs to our version of supported flake. This should be >= 3.5?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does seem like we have any reference. A grep to the whole source code just shows me references to flake8 in this section: https://pandas.pydata.org/pandas-docs/stable/contributing.html#python-pep8

And I don't think now it makes sense to add it. Before the change flake8 had to be 3.4.1 (or older), or the linting failed. Now, the linting should work no matter the version of flake8. Older versions check less errors, but that's a detail of flake8, not about pandas.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better to actually have this requirement

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, no problem, added

@datapythonista
Copy link
Member Author

@jreback I finally added a parameter to the script (I think there are already too many files in ci/ to add 3 more), so every section can be called independently. I also renamed it from ci/lint.sh to ci/code_checks.sh as (technically speaking) linting is just one of the things that it does. So:

  • ./ci/code_checks.sh <- runs everything
  • ./ci/code_checks.sh lint <- flake8 and cpplint
  • ./ci/code_checks.sh patterns <- greps
  • ./ci/code_checks.sh doctests <- doctests

Let me know if this makes sense, or if there is anything else before this can be merged. Thanks

@@ -8,6 +8,8 @@ dependencies:
- dask
- fastparquet
- feather-format
- flake8
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

better to actually have this requirement

MSG='Doctests frame.py' ; echo $MSG
pytest --doctest-modules -v pandas/core/frame.py \
-k"-axes -combine -itertuples -join -nlargest -nsmallest -nunique -pivot_table -quantile -query -reindex -reindex_axis -replace -round -set_index -stack -to_dict -to_stata"
RET=$(($RET + $?)) ; echo $MSG "DONE"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't see a response on splitting this file up into 3 separate files, you can make a sub-dir fi that is helpful and have a top-level script to call them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I mentioned above. What I did is make the script able to run for a specific section. That keeps the advantage of simplicity, and addresses your concerns of being able to run just part of it in Makefile or manually. These are the calls:

  • ./ci/code_checks.sh lint <- flake8 and cpplint
  • ./ci/code_checks.sh patterns <- greps
  • ./ci/code_checks.sh doctests <- doctests

@datapythonista
Copy link
Member Author

@jreback we have several PRs coming that fix doctests, and change ci/doctests.sh which has been removed here. Can you take a look and see if you're happy with the approach of using parameters for each part of the linting instead of splitting the file (it's described above)? I need to keep resolving conflicts if we don't merge this.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. @datapythonista just some doc comments. let's make sure all green as well.

@@ -0,0 +1,131 @@
#!/bin/bash

echo "inside $0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add some docs here on how to run things, maybe also in the contributing docs themselves.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that makes a lot of sense, thank, didn't think about that before. Updated the PR

@jreback
Copy link
Contributor

jreback commented Oct 9, 2018

@datapythonista lgtm to me now. just failing the checks:https://travis-ci.org/pandas-dev/pandas/jobs/439177405

can merge on green.

@jreback jreback added this to the 0.24.0 milestone Oct 9, 2018
@datapythonista datapythonista merged commit f6da1f1 into pandas-dev:master Oct 9, 2018
tm9k1 pushed a commit to tm9k1/pandas that referenced this pull request Nov 19, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CI Continuous Integration Code Style Code style, linting, code_checks
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants