CI: fsspec 2021.6.0 failures #42026

Closed
mzeitlin11 opened this issue Jun 15, 2021 · 1 comment · Fixed by #43849
Labels
CI (Continuous Integration), Dependencies (Required and optional dependencies), IO Data (IO issues that don't fit into a more specific label)

@mzeitlin11
Member

mzeitlin11 commented Jun 15, 2021

pandas/tests/io/test_fsspec.py::test_read_csv and pandas/tests/io/test_fsspec.py::test_markdown_options fail with fsspec version 2021.6.0.

Based on the changelog, the culprit might be

Better testing and folder handling for Memory (654)

but more investigation is needed to determine whether this is a new bug or something we should fix in the failing tests.

Temporary fix in #42023

mzeitlin11 added the CI, Dependencies, and IO Data labels Jun 15, 2021
mzeitlin11 added this to the Contributions Welcome milestone Jun 15, 2021
@jorisvandenbossche
Member

For reference, the failures (indeed related to MemoryFileSystem):

=================================== FAILURES ===================================
________________________________ test_read_csv _________________________________
[gw1] linux -- Python 3.7.10 /usr/share/miniconda/envs/pandas-dev/bin/python

cleared_fs = <fsspec.implementations.memory.MemoryFileSystem object at 0x7f7e66ebff90>

    def test_read_csv(cleared_fs):
        from fsspec.implementations.memory import MemoryFile
    
        cleared_fs.store["test/test.csv"] = MemoryFile(data=text)
>       df2 = read_csv("memory://test/test.csv", parse_dates=["dt"])

pandas/tests/io/test_fsspec.py:44: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
pandas/util/_decorators.py:311: in wrapper
    return func(*args, **kwargs)
pandas/io/parsers/readers.py:586: in read_csv
    return _read(filepath_or_buffer, kwds)
pandas/io/parsers/readers.py:482: in _read
    parser = TextFileReader(filepath_or_buffer, **kwds)
pandas/io/parsers/readers.py:811: in __init__
    self._engine = self._make_engine(self.engine)
pandas/io/parsers/readers.py:1040: in _make_engine
    return mapping[engine](self.f, **self.options)  # type: ignore[call-arg]
pandas/io/parsers/c_parser_wrapper.py:51: in __init__
    self._open_handles(src, kwds)
pandas/io/parsers/base_parser.py:228: in _open_handles
    errors=kwds.get("encoding_errors", "strict"),
pandas/io/common.py:613: in get_handle
    storage_options=storage_options,
pandas/io/common.py:358: in _get_filepath_or_buffer
    filepath_or_buffer, mode=fsspec_mode, **(storage_options or {})
/usr/share/miniconda/envs/pandas-dev/lib/python3.7/site-packages/fsspec/core.py:135: in open
    out = self.__enter__()
/usr/share/miniconda/envs/pandas-dev/lib/python3.7/site-packages/fsspec/core.py:102: in __enter__
    f = self.fs.open(self.path, mode=mode)
/usr/share/miniconda/envs/pandas-dev/lib/python3.7/site-packages/fsspec/spec.py:968: in open
    **kwargs,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <fsspec.implementations.memory.MemoryFileSystem object at 0x7f7e66ebff90>
path = '/test/test.csv', mode = 'rb', block_size = None, autocommit = True
cache_options = None, kwargs = {}, parent = '/'

    def _open(
        self,
        path,
        mode="rb",
        block_size=None,
        autocommit=True,
        cache_options=None,
        **kwargs,
    ):
        path = self._strip_protocol(path)
        if path in self.pseudo_dirs:
            raise IsADirectoryError
        parent = path
        while len(parent) > 1:
            parent = self._parent(parent)
            if self.isfile(parent):
                raise FileExistsError(parent)
        if mode in ["rb", "ab", "rb+"]:
            if path in self.store:
                f = self.store[path]
                if mode == "ab":
                    # position at the end of file
                    f.seek(0, 2)
                else:
                    # position at the beginning of file
                    f.seek(0)
                return f
            else:
>               raise FileNotFoundError(path)
E               FileNotFoundError: /test/test.csv

/usr/share/miniconda/envs/pandas-dev/lib/python3.7/site-packages/fsspec/implementations/memory.py:179: FileNotFoundError
____________________________ test_markdown_options _____________________________
[gw1] linux -- Python 3.7.10 /usr/share/miniconda/envs/pandas-dev/bin/python

self = <pandas.conftest.fsspectest.<locals>.TestMemoryFS object at 0x7f7e62034110>
path = '/afile', start = None, end = None, kwargs = {}

    def cat_file(self, path, start=None, end=None, **kwargs):
        path = self._strip_protocol(path)
        try:
>           return self.store[path].getvalue()[start:end]
E           KeyError: '/afile'

/usr/share/miniconda/envs/pandas-dev/lib/python3.7/site-packages/fsspec/implementations/memory.py:200: KeyError

During handling of the above exception, another exception occurred:

fsspectest = <pandas.conftest.fsspectest.<locals>.TestMemoryFS object at 0x7f7e62034110>

    @td.skip_if_no("tabulate")
    def test_markdown_options(fsspectest):
        df = DataFrame({"a": [0]})
        df.to_markdown("testmem://afile", storage_options={"test": "md_write"})
        assert fsspectest.test[0] == "md_write"
>       assert fsspectest.cat("afile")

pandas/tests/io/test_fsspec.py:285: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
/usr/share/miniconda/envs/pandas-dev/lib/python3.7/site-packages/fsspec/spec.py:727: in cat
    return self.cat_file(paths[0], **kwargs)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <pandas.conftest.fsspectest.<locals>.TestMemoryFS object at 0x7f7e62034110>
path = '/afile', start = None, end = None, kwargs = {}

    def cat_file(self, path, start=None, end=None, **kwargs):
        path = self._strip_protocol(path)
        try:
            return self.store[path].getvalue()[start:end]
        except KeyError:
>           raise FileNotFoundError(path)
E           FileNotFoundError: /afile

/usr/share/miniconda/envs/pandas-dev/lib/python3.7/site-packages/fsspec/implementations/memory.py:202: FileNotFoundError
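
In the first traceback the test writes to the store under the key "test/test.csv", while `_open` looks up '/test/test.csv' after `_strip_protocol`. The sketch below is a toy model of that mismatch, not fsspec's implementation: `ToyMemoryFS`, its `write` and `read` methods, and the exact normalization rule are assumptions made for illustration only. It shows why writing directly into `store` can miss, and why going through the same normalization on both write and read avoids the problem.

```python
# Toy model of an in-memory filesystem. This is NOT fsspec's code; the
# class name, methods, and normalization rule are illustrative assumptions.
class ToyMemoryFS:
    protocol = "memory"

    def __init__(self):
        self.store = {}

    def _strip_protocol(self, path):
        # Assumed rule: drop "memory://" and force a single leading "/".
        prefix = self.protocol + "://"
        if path.startswith(prefix):
            path = path[len(prefix):]
        return "/" + path.strip("/")

    def write(self, path, data):
        # Keys are normalized on the way in.
        self.store[self._strip_protocol(path)] = data

    def read(self, path):
        # Keys are normalized on the way out, using the same rule.
        key = self._strip_protocol(path)
        try:
            return self.store[key]
        except KeyError:
            raise FileNotFoundError(key) from None


def read_after_direct_store_write():
    """Mimic the failing test: bypass normalization when writing."""
    fs = ToyMemoryFS()
    fs.store["test/test.csv"] = b"a,b\n1,2\n"  # key has no leading "/"
    try:
        fs.read("memory://test/test.csv")
    except FileNotFoundError as exc:
        return str(exc)  # the normalized key that was not found
    return None


def read_after_normalized_write():
    """Write through the filesystem so both sides agree on the key."""
    fs = ToyMemoryFS()
    fs.write("memory://test/test.csv", b"a,b\n1,2\n")
    return fs.read("memory://test/test.csv")
```

If the real cause matches this model, one option for the tests would be to create files through the filesystem's public write path instead of assigning into `store`, so the tests do not depend on the internal key format. Whether that is the right fix, or whether the new behavior is an fsspec bug, is the open question in this issue.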

lithomas1 mentioned this issue Oct 2, 2021
jreback modified the milestones: Contributions Welcome, 1.4 Oct 3, 2021