DEPS: Group dependencies and provide information of why they are required #26659

Closed
@datapythonista

Our dependencies file, environment.yml (https://github.com/pandas-dev/pandas/blob/master/environment.yml), is growing, and it's difficult to keep track of what each dependency is required for and whether any of them can be removed.

We already provide some groups:

  • required: pandas requires them to work
  • development: they are used by pandas contributors and CI, but not by pandas itself
  • optional: pandas can use them if they are installed (e.g. fastparquet will be used if pandas.read_parquet() is called, but it's not required otherwise)
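
To illustrate what "optional" means in practice, here is a minimal sketch of a lazy import that only fails when the feature is actually used. The helper name `import_optional` and the module name `hypothetical_missing_engine_xyz` are assumptions made for this example; this is not necessarily how pandas implements it.

```python
import importlib


def import_optional(name, used_by):
    """Import an optional module, or explain which feature needs it."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise ImportError(
            f"Missing optional dependency '{name}', required by {used_by}."
        ) from exc


# A standard-library module stands in for an optional dependency that is installed.
json_module = import_optional("json", used_by="the demo")

# A module that is not installed only raises at the point of use,
# and the error names the feature that needed it.
try:
    import_optional("hypothetical_missing_engine_xyz", used_by="read_parquet")
except ImportError as err:
    message = str(err)
    print(message)
```

Calling the helper inside the function that needs the dependency, rather than at import time, is what lets the rest of the library work without it.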

But those groups don't provide enough information and could be improved. I'm also not sure the current division is correct: for example, nbsphinx should probably be in development, since I don't think pandas itself uses it.

Some groups we could have are:

  • testing (e.g. pytest, hypothesis, ...)
  • documentation (e.g. sphinx, numpydoc, ...)
  • code checks, see ci/code_checks.sh (e.g. flake8, mypy, isort, ...)

For the optional dependencies, I'd personally add a comment, where it makes sense, stating which part of pandas uses each one. For example:

 - fastparquet>=0.2.1  # pandas.read_parquet, DataFrame.to_parquet
 - matplotlib>=2.2.2   # DataFrame.plot, Series.plot, pandas.plotting
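
As a rough sketch of how such a layout could be checked automatically, the script below reads an environment.yml-style text in which each group starts with a comment header and each optional entry may carry a trailing comment. The sample content, the header convention and the function name `group_dependencies` are assumptions for illustration, not the real pandas file or tooling. It parses the text by hand because YAML parsers normally discard comments.

```python
from collections import defaultdict

# Hypothetical excerpt of a grouped environment.yml; the group headers and
# version pins are illustrative, not the real pandas file.
SAMPLE = """\
dependencies:
  # required
  - numpy>=1.15
  - python-dateutil>=2.5.0

  # testing
  - pytest>=4.0.2
  - hypothesis>=3.58.0

  # optional
  - fastparquet>=0.2.1  # pandas.read_parquet, DataFrame.to_parquet
  - matplotlib>=2.2.2   # DataFrame.plot, Series.plot, pandas.plotting
"""


def group_dependencies(text):
    """Map each group header to a list of (requirement, note) pairs."""
    groups = defaultdict(list)
    current = "ungrouped"
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("#"):
            # A comment on its own line starts a new group.
            current = line.lstrip("#").strip()
        elif line.startswith("- "):
            # Split the requirement from its optional trailing comment.
            requirement, _, note = line[2:].partition("#")
            groups[current].append((requirement.strip(), note.strip()))
    return dict(groups)


groups = group_dependencies(SAMPLE)
for name, deps in groups.items():
    print(name, [req for req, _ in deps])
```

A check like this could flag entries that fall under "ungrouped" or optional entries with an empty note.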

Labels: Dependencies (Required and optional dependencies)