Closed
Description
Our dependencies file `environment.yml` (https://github.com/pandas-dev/pandas/blob/master/environment.yml) is growing, and it's difficult to keep track of what each dependency is required for, and whether any of them can be removed.
We already provide some groups:
- required: pandas requires them to work
- development: they are used by pandas contributors and CI, but not by pandas itself
- optional: pandas can use them if they are installed (e.g. `fastparquet` will be used if `pandas.read_parquet()` is called, but it's not required otherwise)
But those groups don't provide enough information and could be improved. I'm also not sure the current division is correct. For example, `nbsphinx` should probably be in development, since I don't think pandas itself uses it.
Some groups we could have are:
- testing (e.g. `pytest`, `hypothesis`, ...)
- documentation (e.g. `sphinx`, `numpydoc`, ...)
- code checks, see `ci/code_checks.sh` (e.g. `flake8`, `mypy`, `isort`, ...)
For the optional dependencies, I'd personally add a comment, where it makes sense, noting which part of pandas uses each one, for example:
- fastparquet>=0.2.1 # pandas.read_parquet, DataFrame.to_parquet
- matplotlib>=2.2.2 # DataFrame.plot, Series.plot, pandas.plotting
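
Putting the groups and the per-dependency comments together, the file could look roughly like the sketch below. This is only an illustration of the idea, not the actual contents of `environment.yml`: the group headers are plain YAML comments, only packages mentioned in this issue are listed, version pins are omitted except for the two quoted above, and placing `nbsphinx` under documentation is my assumption.

```yaml
# Hypothetical grouped layout for environment.yml (illustrative sketch only).
dependencies:
  # required: pandas requires them to work
  # (existing entries would stay here; not listed in this sketch)

  # testing
  - pytest
  - hypothesis

  # documentation
  - sphinx
  - numpydoc
  - nbsphinx  # assumption: moved here, since pandas itself doesn't seem to use it

  # code checks (see ci/code_checks.sh)
  - flake8
  - mypy
  - isort

  # optional: pandas can use them if they are installed
  - fastparquet>=0.2.1  # pandas.read_parquet, DataFrame.to_parquet
  - matplotlib>=2.2.2  # DataFrame.plot, Series.plot, pandas.plotting
```

Since the groups are just comments, conda would see the same list as before, and only the readability for contributors changes.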