-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
STYLE: Fixing and refactoring linting #22863
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
1811954
6ace7ef
a28a612
af76f11
07ebcd0
34e85f0
adcda47
c129197
803b217
62a8813
720f18c
cabe353
9a76f58
f4e17ad
56eca8f
879157f
0ec16fe
cd38de3
8738e80
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,145 @@ | ||
#!/bin/bash | ||
# | ||
# Run checks related to code quality. | ||
# | ||
# This script is intended for both the CI and to check locally that code standards are | ||
# respected. We are currently linting (PEP-8 and similar), looking for patterns of | ||
# common mistakes (sphinx directives with missing blank lines, old style classes, | ||
# unwanted imports...), and we also run doctests here (currently some files only). | ||
# In the future we may want to add the validation of docstrings and other checks here. | ||
# | ||
# Usage: | ||
# $ ./ci/code_checks.sh # run all checks | ||
# $ ./ci/code_checks.sh lint # run linting only | ||
# $ ./ci/code_checks.sh patterns # check for patterns that should not exist | ||
# $ ./ci/code_checks.sh doctests # run doctests | ||
|
||
echo "inside $0" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. can you add some docs here on how to run things, maybe also in the contributing docs themselves. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yep, that makes a lot of sense, thank, didn't think about that before. Updated the PR |
||
[[ $LINT ]] || { echo "NOT Linting. To lint use: LINT=true $0 $1"; exit 0; } | ||
[[ -z "$1" || "$1" == "lint" || "$1" == "patterns" || "$1" == "doctests" ]] || { echo "Unkown command $1. Usage: $0 [lint|patterns|doctests]"; exit 9999; } | ||
|
||
source activate pandas | ||
RET=0 | ||
CHECK=$1 | ||
|
||
|
||
### LINTING ### | ||
if [[ -z "$CHECK" || "$CHECK" == "lint" ]]; then | ||
|
||
# `setup.cfg` contains the list of error codes that are being ignored in flake8 | ||
|
||
echo "flake8 --version" | ||
flake8 --version | ||
|
||
# pandas/_libs/src is C code, so no need to search there. | ||
MSG='Linting .py code' ; echo $MSG | ||
flake8 . | ||
RET=$(($RET + $?)) ; echo $MSG "DONE" | ||
|
||
MSG='Linting .pyx code' ; echo $MSG | ||
flake8 pandas --filename=*.pyx --select=E501,E302,E203,E111,E114,E221,E303,E128,E231,E126,E265,E305,E301,E127,E261,E271,E129,W291,E222,E241,E123,F403,C400,C401,C402,C403,C404,C405,C406,C407,C408,C409,C410,C411 | ||
RET=$(($RET + $?)) ; echo $MSG "DONE" | ||
|
||
MSG='Linting .pxd and .pxi.in' ; echo $MSG | ||
flake8 pandas/_libs --filename=*.pxi.in,*.pxd --select=E501,E302,E203,E111,E114,E221,E303,E231,E126,F403 | ||
RET=$(($RET + $?)) ; echo $MSG "DONE" | ||
|
||
# readability/casting: Warnings about C casting instead of C++ casting | ||
# runtime/int: Warnings about using C number types instead of C++ ones | ||
# build/include_subdir: Warnings about prefacing included header files with directory | ||
|
||
# We don't lint all C files because we don't want to lint any that are built | ||
# from Cython files nor do we want to lint C files that we didn't modify for | ||
# this particular codebase (e.g. src/headers, src/klib, src/msgpack). However, | ||
# we can lint all header files since they aren't "generated" like C files are. | ||
MSG='Linting .c and .h' ; echo $MSG | ||
cpplint --quiet --extensions=c,h --headers=h --recursive --filter=-readability/casting,-runtime/int,-build/include_subdir pandas/_libs/src/*.h pandas/_libs/src/parser pandas/_libs/ujson pandas/_libs/tslibs/src/datetime | ||
RET=$(($RET + $?)) ; echo $MSG "DONE" | ||
|
||
fi | ||
|
||
### PATTERNS ### | ||
if [[ -z "$CHECK" || "$CHECK" == "patterns" ]]; then | ||
|
||
# Check for imports from pandas.core.common instead of `import pandas.core.common as com` | ||
MSG='Check for non-standard imports' ; echo $MSG | ||
! grep -R --include="*.py*" -E "from pandas.core.common import " pandas | ||
RET=$(($RET + $?)) ; echo $MSG "DONE" | ||
|
||
MSG='Check for pytest warns' ; echo $MSG | ||
! grep -r -E --include '*.py' 'pytest\.warns' pandas/tests/ | ||
RET=$(($RET + $?)) ; echo $MSG "DONE" | ||
|
||
# Check for the following code in testing: `np.testing` and `np.array_equal` | ||
MSG='Check for invalid testing' ; echo $MSG | ||
! grep -r -E --include '*.py' --exclude testing.py '(numpy|np)(\.testing|\.array_equal)' pandas/tests/ | ||
RET=$(($RET + $?)) ; echo $MSG "DONE" | ||
|
||
# Check for the following code in the extension array base tests: `tm.assert_frame_equal` and `tm.assert_series_equal` | ||
MSG='Check for invalid EA testing' ; echo $MSG | ||
! grep -r -E --include '*.py' --exclude base.py 'tm.assert_(series|frame)_equal' pandas/tests/extension/base | ||
RET=$(($RET + $?)) ; echo $MSG "DONE" | ||
|
||
MSG='Check for deprecated messages without sphinx directive' ; echo $MSG | ||
! grep -R --include="*.py" --include="*.pyx" -E "(DEPRECATED|DEPRECATE|Deprecated)(:|,|\.)" pandas | ||
RET=$(($RET + $?)) ; echo $MSG "DONE" | ||
|
||
MSG='Check for old-style classes' ; echo $MSG | ||
! grep -R --include="*.py" -E "class\s\S*[^)]:" pandas scripts | ||
RET=$(($RET + $?)) ; echo $MSG "DONE" | ||
|
||
MSG='Check for backticks incorrectly rendering because of missing spaces' ; echo $MSG | ||
! grep -R --include="*.rst" -E "[a-zA-Z0-9]\`\`?[a-zA-Z0-9]" doc/source/ | ||
RET=$(($RET + $?)) ; echo $MSG "DONE" | ||
|
||
MSG='Check for incorrect sphinx directives' ; echo $MSG | ||
! grep -R --include="*.py" --include="*.pyx" --include="*.rst" -E "\.\. (autosummary|contents|currentmodule|deprecated|function|image|important|include|ipython|literalinclude|math|module|note|raw|seealso|toctree|versionadded|versionchanged|warning):[^:]" ./pandas ./doc/source | ||
RET=$(($RET + $?)) ; echo $MSG "DONE" | ||
|
||
MSG='Check for modules that pandas should not import' ; echo $MSG | ||
python -c " | ||
import sys | ||
import pandas | ||
|
||
blacklist = {'bs4', 'gcsfs', 'html5lib', 'ipython', 'jinja2' 'hypothesis', | ||
'lxml', 'numexpr', 'openpyxl', 'py', 'pytest', 's3fs', 'scipy', | ||
'tables', 'xlrd', 'xlsxwriter', 'xlwt'} | ||
mods = blacklist & set(m.split('.')[0] for m in sys.modules) | ||
if mods: | ||
sys.stderr.write('pandas should not import: {}\n'.format(', '.join(mods))) | ||
sys.exit(len(mods)) | ||
" | ||
RET=$(($RET + $?)) ; echo $MSG "DONE" | ||
|
||
fi | ||
|
||
### DOCTESTS ### | ||
if [[ -z "$CHECK" || "$CHECK" == "doctests" ]]; then | ||
|
||
MSG='Doctests frame.py' ; echo $MSG | ||
pytest --doctest-modules -v pandas/core/frame.py \ | ||
-k"-axes -combine -itertuples -join -nlargest -nsmallest -nunique -pivot_table -quantile -query -reindex -reindex_axis -replace -round -set_index -stack -to_dict -to_stata" | ||
RET=$(($RET + $?)) ; echo $MSG "DONE" | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I didn't see a response on splitting this file up into 3 separate files, you can make a sub-dir fi that is helpful and have a top-level script to call them. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think I mentioned above. What I did is make the script able to run for a specific section. That keeps the advantage of simplicity, and addresses your concerns of being able to run just part of it in
|
||
|
||
MSG='Doctests series.py' ; echo $MSG | ||
pytest --doctest-modules -v pandas/core/series.py \ | ||
-k"-nonzero -reindex -searchsorted -to_dict" | ||
RET=$(($RET + $?)) ; echo $MSG "DONE" | ||
|
||
MSG='Doctests generic.py' ; echo $MSG | ||
pytest --doctest-modules -v pandas/core/generic.py \ | ||
-k"-_set_axis_name -_xs -describe -droplevel -groupby -interpolate -pct_change -pipe -reindex -reindex_axis -resample -to_json -transpose -values -xs" | ||
RET=$(($RET + $?)) ; echo $MSG "DONE" | ||
|
||
MSG='Doctests top-level reshaping functions' ; echo $MSG | ||
pytest --doctest-modules -v \ | ||
pandas/core/reshape/concat.py \ | ||
pandas/core/reshape/pivot.py \ | ||
pandas/core/reshape/reshape.py \ | ||
pandas/core/reshape/tile.py \ | ||
-k"-crosstab -pivot_table -cut" | ||
RET=$(($RET + $?)) ; echo $MSG "DONE" | ||
|
||
fi | ||
|
||
exit $RET |
This file was deleted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey, this seems like a good change to me! Quick question I recently actually enforced E741 in pep8speaks config, #22795. Hence this error will be flagged on any new PRs which is the behaviour we want? I was going to follow up remove any final instances of this in codebase. See #22122
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for now is important to have consistency between the CI and pep8speaks, and among all the different files. I don't have a strong opinion about E741, or the errors discussed in #22122, in general I think the less exceptions we have, the better.