BUG: pd.test() fails on Python v3.9.6 #46498

Closed
2 of 3 tasks
SomervilleTom opened this issue Mar 24, 2022 · 8 comments · Fixed by #46510
Labels
Bug · CI (Continuous Integration) · IO Network (Local or Cloud (AWS, GCS, etc.) IO Issues) · Testing (pandas testing functions or related to the test suite) · Unreliable Test (Unit tests that occasionally fail)
Milestone
1.5

Comments

@SomervilleTom

SomervilleTom commented Mar 24, 2022

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

>>> import pandas as pd
>>> pd.test()

Issue Description

>>> import pandas as pd
>>> pd.test()
running: pytest --skip-slow --skip-network --skip-db /home/tms/.local/lib/python3.9/site-packages/pandas
======================================== test session starts =========================================
platform linux -- Python 3.9.6, pytest-7.1.1, pluggy-1.0.0
rootdir: /home/tms
plugins: hypothesis-6.39.4
collected 156958 items / 37 skipped                                                                  
# ...
= 3 failed, 146978 passed, 8670 skipped, 1318 xfailed, 3 xpassed, 169 warnings, 23 errors in 997.19s (0:16:37) =

Note the several failures, warnings, and errors, as well as the extended run time (997.19s).

Expected Behavior

I'm a reasonably proficient Python developer attempting to evaluate whether pandas can solve a specific data imputation issue that has come up in a project. I'm therefore new to pandas, while not new to Python or the rest of my toolchain.

Surely a stable distribution of any package should pass its own documented tests! I see so many failures that I'm left doubting whether pandas is running correctly at all. Since I am new to pandas, I have no way of knowing which, if any, of these failures, errors, and warnings are significant and which are not.

I installed pandas using pip3 on a Rocky Linux v8.5 (a CentOS 8 derivative) system running Python 3.9, following the instructions in the pandas documentation.

I expect any standard "stable" version to pass most of its published tests. In the pandas documentation, the expected results are:

running: pytest --skip-slow --skip-network C:\Users\TP\Anaconda3\envs\py36\lib\site-packages\pandas
============================= test session starts =============================
platform win32 -- Python 3.6.2, pytest-3.6.0, py-1.4.34, pluggy-0.4.0
rootdir: C:\Users\TP\Documents\Python\pandasdev\pandas, inifile: setup.cfg
collected 12145 items / 3 skipped

..................................................................S......
........S................................................................
.........................................................................

==================== 12130 passed, 12 skipped in 368.339 seconds =====================

Significantly, I expect 0 failures and 0 errors. I'm not sure about "xfailed", "xpassed", and "warnings".
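
From what I can tell, pytest's "xfailed" count refers to tests that were marked as expected-to-fail and did fail, while "xpassed" means such a test unexpectedly passed, so neither necessarily indicates a problem. A minimal sketch of my own (not from the pandas suite) illustrating the semantics:

import pytest

# A test marked as expected-to-fail: pytest reports it as "xfailed" when
# the assertion fails, and as "xpassed" if it unexpectedly passes.
@pytest.mark.xfail(reason="illustrating xfail semantics")
def test_expected_failure():
    assert 1 + 1 == 3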

I have tried using pip3 to install various combinations of the 30+ dependencies mentioned in the documentation. Seven of those cause pip3 to fail, and so I skipped them (a note on why follows the list):

SQLAlchemy
psycopg2
PyTables
zlib
PyQt4
xclip
xsel
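
Some of those seven appear not to be pip-installable under the names given: PyTables is published on PyPI as "tables", zlib is a system library rather than a Python package, and xclip/xsel are X11 clipboard utilities that come from the OS package manager. A small sketch, with an abbreviated module list of my own choosing, to check which optional dependencies are importable:

import importlib

# Report which optional pandas dependencies can be imported in this
# environment; note that PyTables imports as "tables".
for name in ["sqlalchemy", "psycopg2", "tables", "pyarrow", "openpyxl"]:
    try:
        importlib.import_module(name)
        print(f"{name}: available")
    except ImportError:
        print(f"{name}: missing")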

After installing all of the installable dependencies, the test results worsened:

= 4 failed, 148951 passed, 7431 skipped, 1330 xfailed, 3 xpassed, 184 warnings, 48 errors in 1081.06s (0:18:01) =

Installed Versions

INSTALLED VERSIONS
------------------
commit : 06d2301
python : 3.9.6.final.0
python-bits : 64
OS : Linux
OS-release : 4.18.0-348.12.2.el8_5.x86_64
Version : #1 SMP Wed Jan 19 17:53:40 UTC 2022
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_US.UTF-8
LOCALE : en_US.UTF-8

pandas : 1.4.1
numpy : 1.22.3
pytz : 2021.3
dateutil : 2.8.2
pip : 20.2.4
setuptools : 50.3.2
Cython : None
pytest : 7.1.1
hypothesis : 6.39.4
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : None
IPython : None
pandas_datareader: None
bs4 : None
bottleneck : 1.3.4
fastparquet : None
fsspec : None
gcsfs : None
matplotlib : None
numba : None
numexpr : 2.8.1
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : 7.0.0
pyreadstat : None
pyxlsb : None
s3fs : None
scipy : 1.8.0
sqlalchemy : None
tables : None
tabulate : None
xarray : None
xlrd : None
xlwt : None
zstandard : None

@SomervilleTom SomervilleTom added the Bug and Needs Triage (Issue that has not been reviewed by a pandas team member) labels Mar 24, 2022
@jreback
Contributor

jreback commented Mar 25, 2022

you are welcome to have a look and report specific bugs

please read the developer documentation

@SomervilleTom
Author

SomervilleTom commented Mar 25, 2022

I'm disappointed that I have to read developer documentation in order for a simple install to pass the tests described on its own installation page.

Can you perhaps be more specific about the apparent issue?

@jreback
Contributor

jreback commented Mar 25, 2022

you are welcome to report a bug
but keep in mind this is an all-volunteer project and we have 150k tests that are run on every commit

if your platform / install doesn't work then you can report a specific issue

@SomervilleTom
Author

SomervilleTom commented Mar 25, 2022

I appreciate your efforts and attention.

I've been doing test-driven development for as long as the concept has existed, so I understand the challenges. If those 150k tests passed on the commit for v1.4.1, then there must be a difference between the configuration on which they passed and mine.

I've shown the result of my attempt to run what I think is the same test suite that is described in the documentation -- yet the header in the test run presented in the documentation says collected 12145 items / 3 skipped, while I see collected 156958 items / 37 skipped.

Why is my suite collecting more than TEN TIMES as many items as the documentation (156958 vs 12145)? Why do I see "37 skipped" items as opposed to "3 skipped" in the documentation? Why does the header in mine mention hypothesis-6.39.4 while that plugin is not mentioned in the documentation? It appears that the documentation shows a test run under Python v3.6.2 while I am running v3.9.6 -- does the commit testing include confirming that the tests pass under multiple versions of Python?

I am under the impression that this exchange IS how I report a specific issue. I'm attempting to offer as much information as I can in hopes that your volunteer team -- who are far more familiar with this package than I am -- might recognize some obvious explanation(s). It seems to me that the difference between the test results I see on my system and the test results offered in the documentation is specific enough to be worth investigating. All the testing in the world doesn't help if well-documented reports of test failures are rejected as not specific enough.

I've provided the information requested below. I'm happy to provide more information as needed.

@SomervilleTom SomervilleTom changed the title BUG: BUG: Panda fails unit tests on Python v3.9.6 Mar 25, 2022
@SomervilleTom SomervilleTom changed the title BUG: Panda fails unit tests on Python v3.9.6 BUG: Panda fails tests on Python v3.9.6 Mar 25, 2022
@SomervilleTom SomervilleTom changed the title BUG: Panda fails tests on Python v3.9.6 BUG: pd.test() fails on Python v3.9.6 Mar 25, 2022
@mzeitlin11
Member

Why is my suite collecting more than TEN TIMES as many items as the documentation (156958 vs 12145)? Why do I see "37 skipped" items as opposed to "3 skipped" in the documentation? Why does the header in mine mention hypothesis-6.39.4 while that plugin is not mentioned in the documentation? It appears that the documentation shows a test run under Python v3.6.2 while I am running v3.9.6 -- does the commit testing include confirming that the tests pass under multiple versions of Python?

That piece of documentation must just be out of date; I'll open an issue about it. It would be great if the test suite could run that quickly :)

I've provided the information requested below. I'm happy to provide more information as needed.

It would be helpful to know which specific tests failed (and if those individual tests fail when run in isolation with pytest) to figure out why you're seeing failures the CI isn't.
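
For example, something along these lines, with the site-packages path taken from your output above:

import pytest

# Run one suspect test module in isolation; -v lists each test as it runs.
pytest.main([
    "-v",
    "/home/tms/.local/lib/python3.9/site-packages/pandas/tests/io/test_clipboard.py",
])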

@mzeitlin11 mzeitlin11 added the Testing (pandas testing functions or related to the test suite) and CI (Continuous Integration) labels and removed the Needs Triage (Issue that has not been reviewed by a pandas team member) label Mar 25, 2022
@SomervilleTom
Author

SomervilleTom commented Mar 25, 2022

It would be helpful to know which specific tests failed (and if those individual tests fail when run in isolation with pytest) to figure out why you're seeing failures the CI isn't.

I've attached a session log from earlier today, showing the complete SSH Client session. I hope this helps!

This is a shell connected to an AWS EC2 t3.xlarge instance running Rocky Linux v8.5. There are ample CPU, storage, and memory resources available.

failure_2022_03251155.log

@mzeitlin11
Member

Thanks! Hmm, except for the clipboard tests, the errors all look like:

  @pytest.fixture(scope="session")
  def s3_base(worker_id):
E       fixture 'worker_id' not found

with the failing tests as listed below (a note on the missing worker_id fixture follows the list):


FAILED .local/lib/python3.9/site-packages/pandas/tests/io/test_clipboard.py::TestClipboard::test_raw_roundtrip[\U0001f44d...]
FAILED .local/lib/python3.9/site-packages/pandas/tests/io/test_clipboard.py::TestClipboard::test_raw_roundtrip[\u03a9\u0153\u2211\xb4...]
FAILED .local/lib/python3.9/site-packages/pandas/tests/io/test_clipboard.py::TestClipboard::test_raw_roundtrip[abcd...]
ERROR .local/lib/python3.9/site-packages/pandas/tests/io/test_parquet.py::TestParquetPyArrow::test_s3_roundtrip_explicit_fs
ERROR .local/lib/python3.9/site-packages/pandas/tests/io/test_parquet.py::TestParquetPyArrow::test_s3_roundtrip
ERROR .local/lib/python3.9/site-packages/pandas/tests/io/test_parquet.py::TestParquetFastParquet::test_s3_roundtrip
ERROR .local/lib/python3.9/site-packages/pandas/tests/io/json/test_compression.py::test_with_s3_url[None]
ERROR .local/lib/python3.9/site-packages/pandas/tests/io/json/test_compression.py::test_with_s3_url[gzip]
ERROR .local/lib/python3.9/site-packages/pandas/tests/io/json/test_compression.py::test_with_s3_url[bz2]
ERROR .local/lib/python3.9/site-packages/pandas/tests/io/json/test_compression.py::test_with_s3_url[zip]
ERROR .local/lib/python3.9/site-packages/pandas/tests/io/json/test_compression.py::test_with_s3_url[xz]
ERROR .local/lib/python3.9/site-packages/pandas/tests/io/json/test_pandas.py::TestPandasContainer::test_read_s3_jsonl
ERROR .local/lib/python3.9/site-packages/pandas/tests/io/json/test_pandas.py::TestPandasContainer::test_to_s3
ERROR .local/lib/python3.9/site-packages/pandas/tests/io/parser/test_network.py::TestS3::test_parse_public_s3n_bucket
ERROR .local/lib/python3.9/site-packages/pandas/tests/io/parser/test_network.py::TestS3::test_parse_public_s3a_bucket
ERROR .local/lib/python3.9/site-packages/pandas/tests/io/parser/test_network.py::TestS3::test_parse_public_s3_bucket_nrows
ERROR .local/lib/python3.9/site-packages/pandas/tests/io/parser/test_network.py::TestS3::test_parse_public_s3_bucket_chunked
ERROR .local/lib/python3.9/site-packages/pandas/tests/io/parser/test_network.py::TestS3::test_parse_public_s3_bucket_chunked_python
ERROR .local/lib/python3.9/site-packages/pandas/tests/io/parser/test_network.py::TestS3::test_parse_public_s3_bucket_python
ERROR .local/lib/python3.9/site-packages/pandas/tests/io/parser/test_network.py::TestS3::test_infer_s3_compression
ERROR .local/lib/python3.9/site-packages/pandas/tests/io/parser/test_network.py::TestS3::test_parse_public_s3_bucket_nrows_python
ERROR .local/lib/python3.9/site-packages/pandas/tests/io/parser/test_network.py::TestS3::test_read_s3_fails
ERROR .local/lib/python3.9/site-packages/pandas/tests/io/parser/test_network.py::TestS3::test_read_csv_handles_boto_s3_object
ERROR .local/lib/python3.9/site-packages/pandas/tests/io/parser/test_network.py::TestS3::test_read_csv_chunked_download
ERROR .local/lib/python3.9/site-packages/pandas/tests/io/parser/test_network.py::TestS3::test_read_s3_with_hash_in_key
ERROR .local/lib/python3.9/site-packages/pandas/tests/io/parser/test_network.py::TestS3::test_read_feather_s3_file_path
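
The worker_id fixture that fails to resolve is the one provided by the pytest-xdist plugin, so my guess (an assumption, not verified against your log) is that these s3 fixture errors appear when the suite is collected without pytest-xdist installed. A quick check you could run in the same environment:

import importlib.util

# pytest-xdist's importable package is "xdist"; None here means the plugin
# (and with it the worker_id fixture) is not installed.
print(importlib.util.find_spec("xdist"))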

@mzeitlin11 mzeitlin11 added the IO Network (Local or Cloud (AWS, GCS, etc.) IO Issues) and Unreliable Test (Unit tests that occasionally fail) labels Mar 25, 2022
@jreback jreback added this to the 1.5 milestone Mar 25, 2022
@SomervilleTom
Author

I don't see any changes in the documentation. Can you clarify where I can find the correct information about how to run the test suite?
