DEPS: Sync environment.yml with CI dep files #47287

Merged
merged 20 commits into from
Jun 22, 2022
Changes from 11 commits
29 changes: 29 additions & 0 deletions .github/workflows/code-checks.yml
@@ -169,3 +169,32 @@ jobs:

     - name: Build image
       run: docker build --pull --no-cache --tag pandas-dev-env .
+
+  requirements-dev-text-installable:
+    name: Test install requirements-dev.txt
+    runs-on: ubuntu-latest
+
+    concurrency:
+      # https://github.community/t/concurrecy-not-work-for-push/183068/7
+      group: ${{ github.event_name == 'push' && github.run_number || github.ref }}-requirements-dev-text-installable
+      cancel-in-progress: true
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v3
+        with:
+          fetch-depth: 0
+
+      - name: Setup Python
+        id: setup_python
+        uses: actions/setup-python@v3
+        with:
+          python-version: '3.8'
+          cache: 'pip'
+          cache-dependency-path: 'requirements-dev.txt'
+
+      - name: Install requirements-dev.txt
+        run: pip install -r requirements-dev.txt
+
+      - name: Check Pip Cache Hit
+        run: echo ${{ steps.setup_python.outputs.cache-hit }}
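
The `group:` expression in the new job uses the `a && b || c` idiom, which GitHub Actions expressions evaluate like a ternary. The sketch below mimics that evaluation in Python to show which runs end up sharing a concurrency group. The function name and the sample refs are illustrative assumptions, not part of the workflow.

```python
def concurrency_group(event_name: str, run_number: int, ref: str, suffix: str) -> str:
    """Mimic `github.event_name == 'push' && github.run_number || github.ref`.

    Hypothetical helper for illustration only; GitHub evaluates the real
    expression itself when the workflow starts.
    """
    # `a and b or c` has the same short-circuit behaviour as `a && b || c`:
    # for push events the key is the run number, otherwise it is the ref.
    key = (event_name == "push" and run_number) or ref
    return f"{key}-{suffix}"


SUFFIX = "requirements-dev-text-installable"

# Two pushes get different run numbers, hence different groups.
print(concurrency_group("push", 101, "refs/heads/main", SUFFIX))
print(concurrency_group("push", 102, "refs/heads/main", SUFFIX))

# Two runs for the same pull-request ref share one group.
print(concurrency_group("pull_request", 103, "refs/pull/47287/merge", SUFFIX))
print(concurrency_group("pull_request", 104, "refs/pull/47287/merge", SUFFIX))
```

Because `cancel-in-progress: true` only cancels runs within the same group, push runs (each with a unique run number) are never cancelled, while a newer run on the same pull-request ref cancels the older one.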
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
-name: Posix
+name: Ubuntu

on:
push:
@@ -145,7 +145,7 @@ jobs:

- name: Extra installs
# xsel for clipboard tests
-run: sudo apt-get update && sudo apt-get install -y libc6-dev-i386 xsel ${{ env.EXTRA_APT }}
+run: sudo apt-get update && sudo apt-get install -y xsel ${{ env.EXTRA_APT }}

- uses: conda-incubator/[email protected]
with:
3 changes: 1 addition & 2 deletions ci/deps/actions-310.yaml
@@ -31,8 +31,7 @@ dependencies:
- jinja2
- lxml
- matplotlib
-# TODO: uncomment after numba supports py310
-#- numba
+- numba
Member

This could lead to an older numpy version being installed, which could(?) explain most of the errors (but not the pyqt stuff).

Member

As there is one dedicated CI run for numpy-dev, it would make sense to use the latest numpy compatible with numba (even for typing). Reverting #45244 would probably fix most of the numpy errors.

Member Author

This file should only affect our specific Python 3.10 build, which just runs the unit tests. The typing checks should run in an environment that is set up by environment.yml.

Member

Sorry, I meant that since we run the unit tests with NumPy-dev in a separate workflow anyway, prioritizing the latest numba version over the latest (released) NumPy version (in environment.yml) could be fine. Either way, it would be good to limit either the numba version or the numpy version in environment.yml.

Member

> As there is one dedicated CI run for numpy-dev, it would make sense to use the latest numpy compatible with numba (even for typing). Reverting #45244 would probably fix most of the numpy errors.

The problem I have with this is that new contributors setting up an environment will get the latest numpy and have mypy errors by default.

We should make the contributor experience pain-free, so (imo) we should use environment.yaml for the typing validation to match the local dev env.

Otherwise, this just makes it difficult for people to contribute to the typing issues.

Now, numba is included in environment.yaml so I'm not sure why when I set up a clean dev locally I get numpy 1.23.1 and on ci we get 1.22.4 (maybe there is some caching on ci?)

My comments here are from looking into this a couple of weeks ago, so this comment may now be out of date. Will look again soon.

Member

> Now, numba is included in environment.yaml so I'm not sure why when I set up a clean dev locally I get numpy 1.23.1 and on ci we get 1.22.4 (maybe there is some caching on ci?)

I must admit that I don't use the official way to setup a pandas-dev env, but it would be great to ensure that the officially documented pandas-env does not cause mypy errors.

Maybe numba has different numpy constraints on conda-forge (or conda installs incompatible versions)? When I ask poetry to install numba = ">=0.53.1" (as in environment.yml) and numpy = ">=1.23.0", it is unable to find a solution (at least not on Linux with Python 3.10).

Member

> I must admit that I don't use the official way to setup a pandas-dev env, but it would be great to ensure that the officially documented pandas-env does not cause mypy errors.

Yes, I need to double-check that's still true.
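
The thread above turns on whether a given numpy release falls inside the range a numba release accepts. A minimal sketch of that check follows, using only the standard library. The bounds `1.18` and `1.23` are assumptions chosen for illustration (they are not stated in this thread), and the helper names are hypothetical. The two candidate versions are the ones reported above for the local and CI environments.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a plain release string such as '1.22.4' into (1, 22, 4)."""
    return tuple(int(part) for part in text.split("."))


def within_bounds(version: str, lower: str, upper: str) -> bool:
    """Return True if lower <= version < upper (plain numeric releases only)."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


# Assumed, illustrative bounds; check the metadata of the numba release
# you actually install before relying on them.
ASSUMED_LOWER = "1.18"
ASSUMED_UPPER = "1.23"

for candidate in ("1.22.4", "1.23.1"):
    ok = within_bounds(candidate, ASSUMED_LOWER, ASSUMED_UPPER)
    print(f"numpy {candidate}: {'inside' if ok else 'outside'} the assumed range")
```

Under these assumed bounds, a resolver asked for both numba and numpy>=1.23.0 has no version to pick, which would be consistent with the poetry failure described above. Pre-release and local version strings are out of scope for this sketch.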

- numexpr
- openpyxl
- odfpy
152 changes: 77 additions & 75 deletions environment.yml
@@ -1,21 +1,85 @@
# Local development dependencies including docs building, website upload, ASV benchmark
name: pandas-dev
channels:
- conda-forge
dependencies:
# required
- numpy>=1.19.5
- python=3.8
- python-dateutil>=2.8.1

# test dependencies
- cython=0.29.30
- pytest>=6.0
- pytest-cov
- pytest-xdist>=1.31
- psutil
- pytest-asyncio>=0.17
- boto3

# required dependencies
- python-dateutil
- numpy
- pytz

# optional dependencies
- beautifulsoup4
- blosc
- brotlipy
- bottleneck
- fastparquet
- fsspec
- html5lib
- hypothesis
- gcsfs
- jinja2
- lxml
- matplotlib
- numba>=0.53.1
- numexpr>=2.8.0 # pin for "Run checks on imported code" job
- openpyxl
- odfpy
- pandas-gbq
- psycopg2
- pyarrow
- pymysql
- pyreadstat
- pytables
- python-snappy
- pyxlsb
- s3fs
- scipy
- sqlalchemy
- tabulate
- xarray
- xlrd
- xlsxwriter
- xlwt
- zstandard

# downstream packages
- aiobotocore<2.0.0 # GH#44311 pinned to fix docbuild
- botocore
- cftime
- dask
- ipython
- geopandas-base
- seaborn
- scikit-learn
- statsmodels
- coverage
- pandas-datareader
- pyyaml
- py
- pytorch

# local testing dependencies
- moto
- flask

# benchmarks
- asv

# building
# The compiler packages are meta-packages and install the correct compiler (activation) packages on the respective platforms.
- c-compiler
- cxx-compiler
- cython>=0.29.30

# code checks
- black=22.3.0
@@ -32,10 +96,11 @@ dependencies:
# documentation
- gitpython # obtain contributors from git for whatsnew
- gitdb
- natsort # DataFrame.sort_values doctest
- numpydoc
- pandas-dev-flaker=0.5.0
- pydata-sphinx-theme=0.8.0
- pytest-cython
- pytest-cython # doctest
- sphinx
- sphinx-panels
- types-python-dateutil
@@ -47,77 +112,14 @@ dependencies:
- nbconvert>=6.4.5
- nbsphinx
- pandoc

# Dask and its dependencies (that dont install with dask)
- dask-core
- toolz>=0.7.3
- partd>=0.3.10
- cloudpickle>=0.2.1

# web (jinja2 is also needed, but it's also an optional pandas dependency)
- markdown
- feedparser
- pyyaml
- requests

# testing
- boto3
- botocore>=1.11
- hypothesis>=5.5.3
- moto # mock S3
- flask
- pytest>=6.0
- pytest-cov
- pytest-xdist>=1.31
- pytest-asyncio>=0.17
- pytest-instafail

# downstream tests
- seaborn
- statsmodels

# unused (required indirectly may be?)
- ipywidgets
- nbformat
- notebook>=6.0.3

# optional
- blosc
- bottleneck>=1.3.1
- ipykernel
- ipython>=7.11.1
- jinja2 # pandas.Styler
- matplotlib>=3.3.2 # pandas.plotting, Series.plot, DataFrame.plot
- numexpr>=2.7.1
- scipy>=1.4.1
- numba>=0.50.1

# optional for io
# ---------------
# pd.read_html
- beautifulsoup4>=4.8.2
- html5lib
- lxml

# pd.read_excel, DataFrame.to_excel, pd.ExcelWriter, pd.ExcelFile
- openpyxl
- xlrd
- xlsxwriter
- xlwt
- odfpy

- fastparquet>=0.4.0 # pandas.read_parquet, DataFrame.to_parquet
- pyarrow>2.0.1 # pandas.read_parquet, DataFrame.to_parquet, pandas.read_feather, DataFrame.to_feather
- python-snappy # required by pyarrow

- pytables>=3.6.1 # pandas.read_hdf, DataFrame.to_hdf
- s3fs>=0.4.0 # file IO when using 's3://...' path
- aiobotocore<2.0.0 # GH#44311 pinned to fix docbuild
- fsspec>=0.7.4 # for generic remote file operations
- gcsfs>=0.6.0 # file IO when using 'gcs://...' path
- sqlalchemy # pandas.read_sql, DataFrame.to_sql
- xarray # DataFrame.to_xarray
- cftime # Needed for downstream xarray.CFTimeIndex test
- pyreadstat # pandas.read_spss
- tabulate>=0.8.3 # DataFrame.to_markdown
- natsort # DataFrame.sort_values
# web
- jinja2 # in optional dependencies, but documented here as needed
- markdown
- feedparser
- pyyaml
- requests
115 changes: 63 additions & 52 deletions requirements-dev.txt
@@ -1,11 +1,66 @@
# This file is auto-generated from environment.yml, do not modify.
# See that file for comments about the need/usage of each dependency.

numpy>=1.19.5
python-dateutil>=2.8.1
cython==0.29.30
pytest>=6.0
pytest-cov
pytest-xdist>=1.31
psutil
pytest-asyncio>=0.17
boto3
python-dateutil
numpy
pytz
beautifulsoup4
blosc
brotlipy
bottleneck
fastparquet
fsspec
html5lib
hypothesis
gcsfs
jinja2
lxml
matplotlib
numba>=0.53.1
numexpr>=2.8.0
openpyxl
odfpy
pandas-gbq
psycopg2
pyarrow
pymysql
pyreadstat
tables
python-snappy
pyxlsb
s3fs
scipy
sqlalchemy
tabulate
xarray
xlrd
xlsxwriter
xlwt
zstandard
aiobotocore<2.0.0
botocore
cftime
dask
ipython
geopandas
seaborn
scikit-learn
statsmodels
coverage
pandas-datareader
pyyaml
py
torch
moto
flask
asv
cython>=0.29.30
black==22.3.0
cpplint
flake8==4.0.1
@@ -18,6 +73,7 @@ pycodestyle
pyupgrade
gitpython
gitdb
natsort
numpydoc
pandas-dev-flaker==0.5.0
pydata-sphinx-theme==0.8.0
@@ -31,58 +87,13 @@ types-setuptools
nbconvert>=6.4.5
nbsphinx
pandoc
dask
toolz>=0.7.3
partd>=0.3.10
cloudpickle>=0.2.1
markdown
feedparser
pyyaml
requests
boto3
botocore>=1.11
hypothesis>=5.5.3
moto
flask
pytest>=6.0
pytest-cov
pytest-xdist>=1.31
pytest-asyncio>=0.17
pytest-instafail
seaborn
statsmodels
ipywidgets
nbformat
notebook>=6.0.3
blosc
bottleneck>=1.3.1
ipykernel
ipython>=7.11.1
jinja2
matplotlib>=3.3.2
numexpr>=2.7.1
scipy>=1.4.1
numba>=0.50.1
beautifulsoup4>=4.8.2
html5lib
lxml
openpyxl
xlrd
xlsxwriter
xlwt
odfpy
fastparquet>=0.4.0
pyarrow>2.0.1
python-snappy
tables>=3.6.1
s3fs>=0.4.0
aiobotocore<2.0.0
fsspec>=0.7.4
gcsfs>=0.6.0
sqlalchemy
xarray
cftime
pyreadstat
tabulate>=0.8.3
natsort
markdown
feedparser
pyyaml
requests
setuptools>=51.0.0
2 changes: 1 addition & 1 deletion scripts/generate_pip_deps_from_conda.py
@@ -21,7 +21,7 @@
import yaml

EXCLUDE = {"python", "c-compiler", "cxx-compiler"}
-RENAME = {"pytables": "tables", "dask-core": "dask"}
+RENAME = {"pytables": "tables", "geopandas-base": "geopandas", "pytorch": "torch"}


def conda_package_to_pip(package: str):
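
The body of `conda_package_to_pip` is not shown in this diff. As a rough illustration of how the `EXCLUDE` and `RENAME` tables could turn an `environment.yml` entry into a `requirements-dev.txt` line, here is a minimal sketch. `to_pip_requirement` is a hypothetical helper written for this example and is not the script's actual implementation; only the two constants are copied from the diff.

```python
import re
from typing import Optional

# Copied from the new side of the diff above.
EXCLUDE = {"python", "c-compiler", "cxx-compiler"}
RENAME = {"pytables": "tables", "geopandas-base": "geopandas", "pytorch": "torch"}


def to_pip_requirement(conda_spec: str) -> Optional[str]:
    """Hypothetical helper: map one conda dependency spec to a pip requirement.

    Returns None for conda-only packages that have no pip equivalent.
    """
    # Drop a trailing YAML comment such as "# pin for ... job".
    spec = conda_spec.split("#", 1)[0].strip()
    # conda pins exact versions with "=", pip uses "=="; leave >=, <=, == alone.
    spec = re.sub(r"(?<![<>=!])=(?!=)", "==", spec)
    # Separate the package name from whatever version constraint follows it.
    match = re.match(r"^([A-Za-z0-9_.\-]+)(.*)$", spec)
    if match is None:
        return None
    name, constraint = match.groups()
    if name in EXCLUDE:
        return None
    return RENAME.get(name, name) + constraint


for entry in ("python=3.8", "black=22.3.0", "pytables", "geopandas-base",
              "pytorch", "numexpr>=2.8.0  # pin for a CI job"):
    print(f"{entry!r} -> {to_pip_requirement(entry)!r}")
```

The sample entries are taken from the `environment.yml` diff above, and the results line up with the corresponding `requirements-dev.txt` lines (`black==22.3.0`, `tables`, `geopandas`, `torch`, `numexpr>=2.8.0`), with `python` dropped entirely.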