Skip to content

CI: mypy&pyright in manual pre-commit stage #47075

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
May 22, 2022
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
21 changes: 10 additions & 11 deletions .github/workflows/code-checks.yml
Original file line number Diff line number Diff line change
Expand Up @@ -67,15 +67,6 @@ jobs:
environment-file: ${{ env.ENV_FILE }}
use-only-tar-bz2: true

- name: Install node.js (for pyright)
uses: actions/setup-node@v3
with:
node-version: "16"

- name: Install pyright
# note: keep version in sync with .pre-commit-config.yaml
run: npm install -g [email protected]

- name: Build Pandas
id: build
uses: ./.github/actions/build_pandas
Expand All @@ -96,8 +87,16 @@ jobs:
run: ci/code_checks.sh docstrings
if: ${{ steps.build.outcome == 'success' }}

- name: Run typing validation
run: ci/code_checks.sh typing
- name: Use existing environment for type checking
run: |
echo $PATH >> $GITHUB_PATH
echo "PYTHONHOME=$PYTHONHOME" >> $GITHUB_ENV
echo "PYTHONPATH=$PYTHONPATH" >> $GITHUB_ENV

- name: Typing
uses: pre-commit/[email protected]
with:
extra_args: --hook-stage manual --all-files
if: ${{ steps.build.outcome == 'success' }}

- name: Run docstring validation script tests
Expand Down
18 changes: 16 additions & 2 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
minimum_pre_commit_version: 2.9.2
exclude: ^LICENSES/|\.(html|csv|svg)$
# reserve "manual" for mypy and pyright
default_stages: [commit, merge-commit, push, prepare-commit-msg, commit-msg, post-checkout, post-commit, post-merge, post-rewrite]
ci:
autofix_prs: false
repos:
Expand Down Expand Up @@ -31,7 +33,9 @@ repos:
- id: debug-statements
- id: end-of-file-fixer
exclude: \.txt$
stages: [commit, merge-commit, push, prepare-commit-msg, commit-msg, post-checkout, post-commit, post-merge, post-rewrite]
- id: trailing-whitespace
stages: [commit, merge-commit, push, prepare-commit-msg, commit-msg, post-checkout, post-commit, post-merge, post-rewrite]
- repo: https://github.com/cpplint/cpplint
rev: 1.6.0
hooks:
Expand Down Expand Up @@ -84,12 +88,22 @@ repos:
- id: pyright
name: pyright
entry: pyright
# note: assumes python env is setup and activated
language: node
pass_filenames: false
types: [python]
stages: [manual]
# note: keep version in sync with .github/workflows/code-checks.yml
additional_dependencies: ['[email protected]']
additional_dependencies: ['[email protected]']
- repo: local
hooks:
- id: mypy
name: mypy
entry: mypy
# note: assumes python env is setup and activated
language: system
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it work to use the mypy hook, and add numpy (and whatever else is needed for types) in additional_dependencies? I tried this a while ago and there some were inconsistencies with how mypy runs on the whole project versus as a hook, but if we were able to get it to pass as a hook, then it'd solve a whole bunch of typing headaches

I've seen a bunch of other projects do this, and it seems pretty reliable

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like that! I think the only downside is that the current pre-commit run is very fast as it doesn't need to install all the dependencies from environment.yml. I'm not familiar with pre-commit, is there a nice way of sharing the exact dependency versions (basically everything in environment.yml)?

I guess to get pyright working, we would need to use its venv argument to point it to pre-commit/mypy's environment (I don't think pre-commit allows installing python dependencies for a node program).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it need to be everything in environment.yml? I would've thought that only a few packages would be necessary, like numpy and maybe pytest

is there a nice way of sharing the exact dependency versions (basically everything in environment.yml)?

I don't think so, other than writing our own script (like the one that checks flake8 versions match)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it need to be everything in environment.yml?

Probably almost everything: imported libraries might have type annotations (and pyright explicitly checks that it can find all imported modules).

I think I will just see what a mess it will be :) then we can decide on whether it is worth it or not

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😄 no worries then, system is fine, as long as the stage is manual (it used to not be system, and then there were a tonne of false-positives, and people would tend to just turn off pre-commit)

Copy link
Member Author

@twoertwein twoertwein May 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, let's try system for this PR! Two questions:

  1. Does github action create a new shell when executing pre-commit: the system hook doesn't find mypy from the activated conda env?
  2. It would be great to have only pyright and mypy in manual. By default all hooks are in all stages. Could change default_stages, then we need to call pre-commit only once for mypy+pyright (might reduce some overhead). Do you think that would be helpful? If so, what value should default_stages have?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is actually not that bad listing all dependencies for mypy, see the latest commit.

The issue is to programmatically find the python env that pre-commit created for mypy so that pyright can use it as well.

While it is possible to move mypy and later move pyright, I would really like to make sure that pyright uses the same env as mypy, as pyright errors when it encounters any missing modules. It is easy to forget a module, which is then resolved as Any (can hide errors).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought I found an ugly way that should work on linux but it seems that pyright (and probably mypy too?) needs pandas to be compiled for some of its checks, see the error in pre-commit now.

Compiling pandas in the pre-commit is a little bit too much (more dependencies and it takes much longer). Let's go back to the system+manual approach.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@MarcoGorelli I'm mostly pattern matching when it comes to github action+pre-commit but I found a working system+manual version :)

pass_filenames: false
types: [python]
stages: [manual]
- repo: local
hooks:
- id: flake8-rst
Expand Down
23 changes: 2 additions & 21 deletions ci/code_checks.sh
Original file line number Diff line number Diff line change
Expand Up @@ -11,11 +11,10 @@
# $ ./ci/code_checks.sh code # checks on imported code
# $ ./ci/code_checks.sh doctests # run doctests
# $ ./ci/code_checks.sh docstrings # validate docstring errors
# $ ./ci/code_checks.sh typing # run static type analysis
# $ ./ci/code_checks.sh single-docs # check single-page docs build warning-free

[[ -z "$1" || "$1" == "code" || "$1" == "doctests" || "$1" == "docstrings" || "$1" == "typing" || "$1" == "single-docs" ]] || \
{ echo "Unknown command $1. Usage: $0 [code|doctests|docstrings|typing]"; exit 9999; }
[[ -z "$1" || "$1" == "code" || "$1" == "doctests" || "$1" == "docstrings" || "$1" == "single-docs" ]] || \
{ echo "Unknown command $1. Usage: $0 [code|doctests|docstrings]"; exit 9999; }

BASE_DIR="$(dirname $0)/.."
RET=0
Expand Down Expand Up @@ -85,24 +84,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then

fi

### TYPING ###
if [[ -z "$CHECK" || "$CHECK" == "typing" ]]; then

echo "mypy --version"
mypy --version

MSG='Performing static analysis using mypy' ; echo $MSG
mypy
RET=$(($RET + $?)) ; echo $MSG "DONE"

# run pyright, if it is installed
if command -v pyright &> /dev/null ; then
MSG='Performing static analysis using pyright' ; echo $MSG
pyright
RET=$(($RET + $?)) ; echo $MSG "DONE"
fi
fi

### SINGLE-PAGE DOCS ###
if [[ -z "$CHECK" || "$CHECK" == "single-docs" ]]; then
python doc/make.py --warnings-are-errors --single pandas.Series.value_counts
Expand Down
14 changes: 7 additions & 7 deletions doc/source/development/contributing_codebase.rst
Original file line number Diff line number Diff line change
Expand Up @@ -23,19 +23,19 @@ contributing them to the project::

./ci/code_checks.sh

The script validates the doctests, formatting in docstrings, static typing, and
The script validates the doctests, formatting in docstrings, and
imported modules. It is possible to run the checks independently by using the
parameters ``docstring``, ``code``, ``typing``, and ``doctests``
parameters ``docstring``, ``code``, and ``doctests``
(e.g. ``./ci/code_checks.sh doctests``).

In addition, because a lot of people use our library, it is important that we
do not make sudden changes to the code that could have the potential to break
a lot of user code as a result, that is, we need it to be as *backwards compatible*
as possible to avoid mass breakages.

In addition to ``./ci/code_checks.sh``, some extra checks are run by
``pre-commit`` - see :ref:`here <contributing.pre-commit>` for how to
run them.
In addition to ``./ci/code_checks.sh``, some extra checks (including static type
checking) are run by ``pre-commit`` - see :ref:`here <contributing.pre-commit>`
for how to run them.

.. _contributing.pre-commit:

Expand Down Expand Up @@ -260,9 +260,9 @@ pandas uses `mypy <http://mypy-lang.org>`_ and `pyright <https://github.com/micr

.. code-block:: shell

./ci/code_checks.sh typing
pre-commit run --hook-stage manual --all-files

A recent version of ``numpy`` (>=1.21.0) is required for type validation.
in your activated python environment. A recent version of ``numpy`` (>=1.22.0) is required for type validation.

.. _contributing.ci:

Expand Down
2 changes: 1 addition & 1 deletion environment.yml
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@ dependencies:
- flake8-comprehensions=3.7.0 # used by flake8, linting of unnecessary comprehensions
- isort>=5.2.1 # check that imports are in the right order
- mypy=0.950
- pre-commit>=2.9.2
- pre-commit>=2.15.0
- pycodestyle # used by flake8
- pyupgrade

Expand Down
2 changes: 1 addition & 1 deletion requirements-dev.txt
Original file line number Diff line number Diff line change
Expand Up @@ -13,7 +13,7 @@ flake8-bugbear==21.3.2
flake8-comprehensions==3.7.0
isort>=5.2.1
mypy==0.950
pre-commit>=2.9.2
pre-commit>=2.15.0
pycodestyle
pyupgrade
gitpython
Expand Down