Test Failure with xlrd and defusedxml #27016

Closed
simonjayhawkins opened this issue Jun 24, 2019 · 8 comments
Labels: Dependencies (Required and optional dependencies), IO Excel (read_excel, to_excel), Testing (pandas testing functions or related to the test suite), Unreliable Test (Unit tests that occasionally fail)

Comments

@simonjayhawkins
Member

================================== FAILURES ===================================
__________________ TestReaders.test_usecols_int[xlrd-.xlsx] ___________________

self = <pandas.tests.io.excel.test_readers.TestReaders object at 0x00000268B4801B70>
read_ext = '.xlsx'
df_ref =                    A         B         C
index
2000-01-03  0.980269  3.685731 -0.36...0-01-07 -0.487094  0.571455 -1.611639
2000-01-10  0.836649  0.246462  0.588543
2000-01-11 -0.157161  1.340307  1.195778

    def test_usecols_int(self, read_ext, df_ref):
        df_ref = df_ref.reindex(columns=["A", "B", "C"])

        # usecols as int
        with tm.assert_produces_warning(FutureWarning,
                                        check_stacklevel=False):
            with ignore_xlrd_time_clock_warning():
                df1 = pd.read_excel("test1" + read_ext, "Sheet1",
>                                   index_col=0, usecols=3)

..\excel\test_readers.py:60:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <contextlib._GeneratorContextManager object at 0x00000268B549AFD0>
type = None, value = None, traceback = None

    def __exit__(self, type, value, traceback):
        if type is None:
            try:
>               next(self.gen)
E               AssertionError: Caused unexpected warning(s): [('PendingDeprecationWarning', PendingDeprecationWarning("This method will be removed in future versions.  Use 'tree.iter()' or 'list(tree.iter())' instead."), 'C:\\Users\\simon\\Anaconda3\\envs\\pandas-dev\\lib\\site-packages\\xlrd\\xlsx.py', 266), ('PendingDeprecationWarning', PendingDeprecationWarning("This method will be removed in future versions.  Use 'tree.iter()' or 'list(tree.iter())' instead."), 'C:\\Users\\simon\\Anaconda3\\envs\\pandas-dev\\lib\\site-packages\\xlrd\\xlsx.py', 312), ('PendingDeprecationWarning', PendingDeprecationWarning("This method will be removed in future versions.  Use 'tree.iter()' or 'list(tree.iter())' instead."), 'C:\\Users\\simon\\Anaconda3\\envs\\pandas-dev\\lib\\site-packages\\xlrd\\xlsx.py', 266)].

C:\Users\simon\Anaconda3\envs\pandas-dev\lib\contextlib.py:119: AssertionError
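
The failure mode in the traceback above can be reproduced without pandas or xlrd. The following is a minimal sketch, assuming a simplified stand-in for `tm.assert_produces_warning`; the names `expect_only`, `noisy_reader`, and `run_check` are hypothetical and are not pandas or xlrd APIs. The context manager records every warning, requires the expected category, and raises `AssertionError` on exit if anything else was emitted, which is why a `PendingDeprecationWarning` from a dependency can fail a test that only expects a `FutureWarning`.

```python
import warnings
from contextlib import contextmanager


@contextmanager
def expect_only(expected):
    """Hypothetical, simplified stand-in for tm.assert_produces_warning."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record every warning, even repeats
        yield
    seen = [w.category for w in caught]
    if not any(issubclass(c, expected) for c in seen):
        raise AssertionError(f"Did not see expected warning {expected.__name__}")
    unexpected = [c.__name__ for c in seen if not issubclass(c, expected)]
    if unexpected:
        raise AssertionError(f"Caused unexpected warning(s): {unexpected}")


def noisy_reader():
    """Hypothetical reader: emits the expected warning plus a third-party one."""
    warnings.warn("usecols as int is deprecated", FutureWarning)
    warnings.warn(
        "This method will be removed in future versions.",
        PendingDeprecationWarning,
    )
    return "data"


def run_check():
    """Return the assertion message, or 'passed' if no assertion was raised."""
    try:
        with expect_only(FutureWarning):
            noisy_reader()
    except AssertionError as exc:
        return str(exc)
    return "passed"
```

Calling `run_check()` returns the assertion message rather than `"passed"`, because the second warning is not a `FutureWarning`.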

INSTALLED VERSIONS
------------------
commit           : cf74b0272af2e13e5b9ce40c8bf42df750ddc560
python           : 3.7.3.final.0
python-bits      : 64
OS               : Windows
OS-release       : 10
machine          : AMD64
processor        : Intel64 Family 6 Model 69 Stepping 1, GenuineIntel
byteorder        : little
LC_ALL           : None
LANG             : en_GB.UTF-8
LOCALE           : None.None

pandas           : 0.25.0.dev0+786.gcf74b0272
numpy            : 1.16.4
pytz             : 2019.1
dateutil         : 2.8.0
pip              : 19.1.1
setuptools       : 40.6.3
Cython           : 0.29.10
pytest           : 4.6.2
hypothesis       : 4.23.6
sphinx           : 1.8.5
blosc            : None
feather          : 0.4.0
xlsxwriter       : 1.1.8
lxml.etree       : 4.3.3
html5lib         : 1.0.1
pymysql          : 0.9.3
psycopg2         : None
jinja2           : 2.10.1
IPython          : 7.5.0
pandas_datareader: None
bs4              : 4.7.1
bottleneck       : 1.2.1
fastparquet      : 0.3.0
gcsfs            : None
matplotlib       : 3.1.0
numexpr          : 2.6.9
openpyxl         : 2.6.2
pandas_gbq       : None
pyarrow          : 0.11.1
pytables         : None
s3fs             : 0.2.1
scipy            : 1.2.1
sqlalchemy       : 1.3.4
tables           : 3.5.2
xarray           : 0.12.1
xlrd             : 1.2.0
xlwt             : 1.3.0
xlsxwriter       : 1.1.8
@simonjayhawkins added the Testing, IO Excel, and Windows labels Jun 24, 2019
@simonjayhawkins added this to the Contributions Welcome milestone Jun 24, 2019
@WillAyd
Member

WillAyd commented Jun 24, 2019

Do you have defusedxml installed?

@simonjayhawkins
Member Author

> Do you have defusedxml installed?

Not sure. I have been changing my environment to check some SQL test setup errors.

I am getting this though...

$ conda env update
Collecting package metadata: ...working... done
Solving environment: ...working...
The environment is inconsistent, please check the package plan carefully
The following packages are causing the inconsistency:

  - defaults/win-64::fastparquet==0.3.0=py37h8c2d366_0
  - conda-forge/noarch::feather-format==0.4.0=py_1003
  - defaults/win-64::pyarrow==0.11.1=py37h33f27b4_0
  - defaults/win-64::seaborn==0.9.0=py37_0
  - defaults/win-64::statsmodels==0.9.0=py37h452e1ab_0
  - defaults/noarch::xarray==0.12.1=py_0
done

I will start again with a clean environment and check.

@simonjayhawkins
Member Author

With a clean environment it still fails. defusedxml is installed.

# packages in environment at C:\Users\simon\Anaconda3\envs\pandas-dev:
#
# Name                    Version                   Build  Channel
defusedxml                0.6.0                      py_0
(pandas-dev)

@WillAyd
Member

WillAyd commented Jun 24, 2019

Good to know. This might be related to #26657 (comment) (cc @datapythonista) and to general bugginess in xlrd (https://github.com/python-excel/xlrd/issues/315#issuecomment-493715073) that won't get fixed.

I think we should just catch these warnings in the tests.
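
One way to catch such warnings is with the standard-library `warnings` filters; the traceback shows the test already wrapping the call in `ignore_xlrd_time_clock_warning()`, and a filter for `PendingDeprecationWarning` could follow the same pattern. The sketch below is an assumption about the approach, not the fix that was applied in pandas; `legacy_parse`, `parse_quietly`, and `count_warnings` are hypothetical names.

```python
import warnings


def legacy_parse():
    """Hypothetical stand-in for a third-party call that warns (e.g. inside xlrd)."""
    warnings.warn(
        "This method will be removed in future versions.",
        PendingDeprecationWarning,
    )
    return [1, 2, 3]


def parse_quietly():
    """Run legacy_parse while ignoring only PendingDeprecationWarning."""
    with warnings.catch_warnings():
        # The filter is scoped to this block; other categories still propagate.
        warnings.simplefilter("ignore", category=PendingDeprecationWarning)
        return legacy_parse()


def count_warnings(func):
    """Return how many warnings escape from func."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        func()
    return len(caught)
```

Filtering by category keeps the suppression narrow: a `FutureWarning` raised inside `parse_quietly` would still reach the surrounding test and could still be asserted on.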

@datapythonista
Member

Is this not failing in the CI because we don't have a test build with environment.yml?

I didn't know about defusedxml, but it seems like a good package to use. The conversation in the xlrd issue is a bit worrying, but I guess there is not much we can do. Updating the tests so the warning doesn't cause a failure seems to be the only option, as @WillAyd suggests.

@WillAyd
Member

WillAyd commented Jun 24, 2019

> Is this not failing in the CI because we don't have a test build with environment.yml?

Yeah, that's right. The failure needs the combination of xlrd and defusedxml, which I don't think appears in CI.
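
To see whether a given environment has that combination, something like the following could be run. This is a minimal sketch using only the standard library (`importlib.metadata` requires Python 3.8 or later); the helper names `installed_version` and `has_combination` are hypothetical.

```python
from importlib import metadata, util


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def has_combination(names=("xlrd", "defusedxml")):
    """True only when every named module can be imported in this environment."""
    return all(util.find_spec(name) is not None for name in names)


if __name__ == "__main__":
    for pkg in ("xlrd", "defusedxml"):
        print(pkg, installed_version(pkg))
    print("both importable:", has_combination())
```

If `has_combination()` is false in a CI environment, that environment would not be expected to exercise the code path discussed here.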

@simonjayhawkins added the Dependencies label Jun 24, 2019
@WillAyd removed the Windows label Jun 24, 2019
@WillAyd changed the title from "local test failure on Windows with xlrd 1.2.0" to "Test Failure with xlrd and defusedxml" Jun 24, 2019
@jbrockmendel added the Unreliable Test label Dec 25, 2019
@jbrockmendel
Member

Any chance this has resolved itself in the interim?

@mroeschke
Member

This looks okay when I run it locally:

% conda list defusedxml
# packages in environment at /Users/matthewroeschke/opt/miniconda3/envs/pandas-dev:
#
# Name                    Version                   Build  Channel
defusedxml                0.7.1              pyhd8ed1ab_0    conda-forge
% pytest pandas/tests/io/excel/test_readers.py::TestReaders::test_usecols_int
=================================================================== test session starts ====================================================================
platform darwin -- Python 3.8.12, pytest-7.1.2, pluggy-1.0.0
rootdir: , configfile: pyproject.toml
plugins: cython-0.2.0, asyncio-0.18.3, xdist-2.5.0, leaks-0.3.1, forked-1.4.0, hypothesis-6.46.9, instafail-0.4.1, cov-3.0.0
asyncio: mode=strict
collected 8 items

pandas/tests/io/excel/test_readers.py ......s.

=================================================================== slowest 30 durations ===================================================================
0.07s call     pandas/tests/io/excel/test_readers.py::TestReaders::test_usecols_int[engine_and_read_ext7]
0.02s call     pandas/tests/io/excel/test_readers.py::TestReaders::test_usecols_int[engine_and_read_ext4]
0.02s call     pandas/tests/io/excel/test_readers.py::TestReaders::test_usecols_int[engine_and_read_ext2]
0.02s call     pandas/tests/io/excel/test_readers.py::TestReaders::test_usecols_int[engine_and_read_ext5]
0.02s call     pandas/tests/io/excel/test_readers.py::TestReaders::test_usecols_int[engine_and_read_ext1]
0.01s call     pandas/tests/io/excel/test_readers.py::TestReaders::test_usecols_int[engine_and_read_ext0]
0.01s call     pandas/tests/io/excel/test_readers.py::TestReaders::test_usecols_int[engine_and_read_ext3]

(16 durations < 0.005s hidden.  Use -vv to show these durations.)
=============================================================== 7 passed, 1 skipped in 1.30s ===============================================================

Closing since I can't reproduce this and there has been no recent activity, but happy to reopen if it appears again.


5 participants