
TST: error in parquet tests on windows with new pyarrow exception #45344

Closed
jreback opened this issue Jan 13, 2022 · 3 comments · Fixed by #45364
Labels: IO Parquet (parquet, feather), Testing (pandas testing functions or related to the test suite)
Milestone: 1.5

Comments

@jreback
Contributor

jreback commented Jan 13, 2022

see https://dev.azure.com/pandas-dev/pandas/_build/results?buildId=72195&view=logs&j=9dd5e044-6afa-5be6-cb56-860f1ee82b31&t=d3ae057b-d0af-59ce-6e4c-b69ef3739701

================================== FAILURES ===================================
__________________________ test_arrowparquet_options __________________________
[gw1] win32 -- Python 3.8.12 C:\Miniconda\envs\pandas-dev\python.exe

fsspectest = <pandas.conftest.fsspectest.<locals>.TestMemoryFS object at 0x00000236F72FED90>

    @td.skip_if_no("pyarrow")
    def test_arrowparquet_options(fsspectest):
        """Regression test for writing to a not-yet-existent GCS Parquet file."""
        df = DataFrame({"a": [0]})
>       df.to_parquet(
            "testmem://test/test.csv",
            engine="pyarrow",
            compression=None,
            storage_options={"test": "parquet_write"},
        )

pandas\tests\io\test_fsspec.py:167: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pandas\util\_decorators.py:207: in wrapper
    return func(*args, **kwargs)
pandas\core\frame.py:2842: in to_parquet
    return to_parquet(
pandas\io\parquet.py:420: in to_parquet
    impl.write(
pandas\io\parquet.py:195: in write
    self.api.parquet.write_table(
C:\Miniconda\envs\pandas-dev\lib\site-packages\pyarrow\parquet.py:1673: in write_table
    with ParquetWriter(
C:\Miniconda\envs\pandas-dev\lib\site-packages\pyarrow\parquet.py:544: in __init__
    filesystem, path = resolve_filesystem_and_path(where, filesystem)
C:\Miniconda\envs\pandas-dev\lib\site-packages\pyarrow\filesystem.py:455: in resolve_filesystem_and_path
    return _ensure_filesystem(filesystem), path
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

fs = <pandas.conftest.fsspectest.<locals>.TestMemoryFS object at 0x00000236F72CAE80>

@jreback jreback added Testing pandas testing functions or related to the test suite IO Parquet parquet, feather labels Jan 13, 2022
@jreback jreback added this to the 1.5 milestone Jan 13, 2022
@jorisvandenbossche
Member

So this started failing around the master/main rename commit. That is not the cause, but comparing the environments of the last working build on main and the first failing build shows that fsspec was updated from 2021.11.1 to 2022.1.0 (botocore and s3fs were updated as well).

So this is probably a change in fsspec that causes pyarrow to no longer recognize the filesystem.
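
For anyone repeating this kind of comparison, here is a minimal sketch of diffing two environment listings. The helper name `diff_environments` is hypothetical, and the dictionaries only contain the versions mentioned in this thread; it assumes pyarrow stayed at 1.0.1 in both builds, and the botocore and s3fs versions are left out because they are not stated here.

```python
def diff_environments(before, after):
    """Return {package: (old, new)} for every package whose version differs.

    A package present on only one side is reported with None on the other.
    """
    changed = {}
    for name in sorted(set(before) | set(after)):
        old, new = before.get(name), after.get(name)
        if old != new:
            changed[name] = (old, new)
    return changed


# Only the versions quoted in this thread; the real environments are much larger.
# Assumption: pyarrow was 1.0.1 in both builds.
last_passing = {"fsspec": "2021.11.1", "pyarrow": "1.0.1"}
first_failing = {"fsspec": "2022.1.0", "pyarrow": "1.0.1"}

changes = diff_environments(last_passing, first_failing)
print(changes)  # {'fsspec': ('2021.11.1', '2022.1.0')}
```

In practice the two dictionaries would be built from the package lists printed in the CI logs of each build.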

@jorisvandenbossche
Member

cc @martindurant

@jorisvandenbossche
Member

OK, checking fsspec/filesystem_spec#880 and the discussion in fsspec/filesystem_spec#879 (comment), this should basically only start failing with pyarrow < 2, and indeed this specific test build is using pyarrow 1.0.1.

So, in summary: if we want to test an old pyarrow, we should either pin fsspec to an older version as well, or update the pyarrow version in that build:

- pyarrow=1.0.1

- fsspec>=0.8.0

I would propose that we bump the pin in that file for pyarrow to at least pyarrow=2 (we still have other builds with pyarrow 1.0).
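
As an illustration of the combination described above, a rough sketch of a version check follows. The helper names `parse_version` and `is_affected` are hypothetical, and the fsspec threshold of 2022.1.0 is an assumption based only on the first failing build mentioned in this thread; the exact release that introduced the change is not confirmed here.

```python
import re


def parse_version(text):
    """Turn a version string such as '2022.1.0' into a tuple of ints, e.g. (2022, 1, 0).

    Only the first three numeric components are kept, which is enough for
    the coarse comparison below.
    """
    return tuple(int(number) for number in re.findall(r"\d+", text)[:3])


def is_affected(pyarrow_version, fsspec_version):
    """Return True for the combination discussed in this issue.

    pyarrow < 2 comes from the discussion above; fsspec >= 2022.1.0 is an
    assumed threshold taken from the first failing build.
    """
    old_pyarrow = parse_version(pyarrow_version) < (2,)
    new_fsspec = parse_version(fsspec_version) >= (2022, 1, 0)
    return old_pyarrow and new_fsspec


print(is_affected("1.0.1", "2022.1.0"))   # True: the failing Windows build
print(is_affected("1.0.1", "2021.11.1"))  # False: the last working build
print(is_affected("2.0.0", "2022.1.0"))   # False: after bumping the pyarrow pin
```

A check like this could back a test skip condition, but bumping the pin in the build file, as proposed above, avoids needing it at all.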
