Skip to content

Using os.PathLike instead of pathlib.Path (#37979) #38018

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Dec 2, 2020

Conversation

Abhi-H
Copy link
Contributor

@Abhi-H Abhi-H commented Nov 23, 2020

Replaces pathlib.Path with os.PathLike to bring behaviour further in-line with the documentation

@Abhi-H Abhi-H changed the title Using os.PathLike instead of pathlib.Path (pandas-dev#37979) Using os.PathLike instead of pathlib.Path (#37979) Nov 23, 2020
Copy link
Member

@arw2019 arw2019 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Abhi-H for the PR

Mypy complaints to resolve:

pandas/io/common.py:192: error: Argument 1 to "_expand_user" has incompatible type "Union[Any, _PathLike[str], str, IO[str], RawIOBase, BufferedIOBase, TextIOBase, mmap]"; expected "Union[str, Union[IO[str], RawIOBase, BufferedIOBase, TextIOBase, TextIOWrapper, mmap]]"  [arg-type]
pandas/io/common.py:192: error: Argument 1 to "_expand_user" has incompatible type "Union[Any, _PathLike[str], str, IO[bytes], RawIOBase, BufferedIOBase, TextIOBase, TextIOWrapper, mmap]"; expected "Union[str, Union[IO[bytes], RawIOBase, BufferedIOBase, TextIOBase, TextIOWrapper, mmap]]"  [arg-type]

@jreback jreback added the Typing type annotations, mypy/pyright type checking label Nov 24, 2020
@jreback jreback modified the milestone: 1.2 Nov 24, 2020
@jreback
Copy link
Contributor

jreback commented Nov 24, 2020

@Abhi-H see if you can make mypy happy.

@Abhi-H
Copy link
Contributor Author

Abhi-H commented Nov 24, 2020

Working on fixing the mypy errors.

Copy link
Member

@arw2019 arw2019 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Abhi-H seems you have a merge error?

FWIW mypy is happy now although I'm not sure the change you were making is still here

@Abhi-H
Copy link
Contributor Author

Abhi-H commented Nov 27, 2020

@arw2019 Is there still a merge error? I don't quite understand what those are, so a link or guidance on how I could resolve it would be very helpful. I am assuming that this is related to the git side of things as the code seems to pass all checks, so I can quickly resolve it!
Also, just checked and the changes I was making are there.

@Abhi-H
Copy link
Contributor Author

Abhi-H commented Nov 27, 2020

With some git-magic and help from a friend, I think I fixed the merge conflict. Let me know if this is fine!

@@ -7,7 +7,7 @@
from io import BufferedIOBase, BytesIO, RawIOBase, TextIOWrapper
import mmap
import os
import pathlib
from os import PathLike
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

os is already imported in the line above, you can simply use os.PathLike

@@ -187,7 +187,7 @@ def stringify_path(
# error: Item "IO[bytes]" of "Union[str, Path, IO[bytes]]" has no
# attribute "__fspath__" [union-attr]
filepath_or_buffer = filepath_or_buffer.__fspath__() # type: ignore[union-attr]
elif isinstance(filepath_or_buffer, pathlib.Path):
elif isinstance(filepath_or_buffer, PathLike):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

os.PathLike defines that its implementations return the representation of the path when calling __fspath__(). This means that the elif-block should never get executed (and was never executed?) as the if-block checks for the existence of __fspath__:

if hasattr(filepath_or_buffer, "__fspath__"):

If all objects that provide __fspath__ also inherit from os.PathLike (I don't know whether that is the case - maybe temporarily put a assert False in the elif block), we can probably avoid the mypy ignore statement in the if-block by simple replacing the if/elif-block with:

if isinstance(filepath_or_buffer, PathLike):
    filepath_or_buffer = filepath_or_buffer.__fspath__()

Copy link
Contributor Author

@Abhi-H Abhi-H Nov 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, the elif block was never being executed after all. I think it was being used more as a placeholder to keep mypy happy as mypy raises an incompatible type error if the elif block removed because it does not register that the filepath_or_buffer has been changed to str type by the __fspath__ in the if block. It only seems to register this type change and not raise an error due to the elif block even if the block is actually never executed.
An alternate approach that I am considering is:

if isinstance(filepath_or_buffer, os.PathLike):
       filepath_or_buffer = filepath_or_buffer.__fspath__()
elif hasattr(filepath_or_buffer, "__fspath__"):
        assert False

This satisfies mypy without the need for ignore statements and the assert False would raise an error in the future if an object that provides __fspath__ and does not inherit from os.PathLike reaches this function.
What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mypy doesn't seem to complain (for me locally) if you just have this:

if isinstance(filepath_or_buffer, os.PathLike):
       filepath_or_buffer = filepath_or_buffer.__fspath__()

Copy link
Contributor Author

@Abhi-H Abhi-H Nov 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, using only isinstance will agree with mypy as well as all the tests. My thought process behind the asserts False was not related to mypy. It was more to raise a flag if future versions of the code attempted to pass an object which is not os.PathLike yet has __fspath__ defined for it as the piece:

if isinstance(filepath_or_buffer, os.PathLike):
       filepath_or_buffer = filepath_or_buffer.__fspath__()

will not be calling __fspath for an object that does not inherit from os.PathLike so developers will maybe be confused about this behaviour.
However, I'm not sure how useful such a warning generated by the asserts False will be so I could just stick with:

if isinstance(filepath_or_buffer, os.PathLike):
       filepath_or_buffer = filepath_or_buffer.__fspath__()

and submit the code for review

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

idk, I assume that pandas devs either want raise SomeError("...") in the elif-block or no elif-block at all @jreback

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this and feel like doing away with the elif block altogether is the way to go, as mypy will warn developers if they try to pass an object that does not inherit from os.PathLike through the function. I'll change the PR to only include:

if isinstance(filepath_or_buffer, os.PathLike):
       filepath_or_buffer = filepath_or_buffer.__fspath__()

@Abhi-H
Copy link
Contributor Author

Abhi-H commented Nov 29, 2020

I am a little stuck here. The tests that are failing in these builds are not failing when I run them locally. What could be causing this?

@twoertwein
Copy link
Member

twoertwein commented Nov 29, 2020

I think the only test that is failing in this PR but not on master is test_to_csv_compression_encoding_gcs. Can you run that test locally to see whether it works for you pytest -k test_to_csv_compression_encoding_gcs pandas/tests/io/test_gcs.py. I will test it later today on my machine on linux.

edit: I don't get the above error. The removed hasattr(...) shouldn't have an impact as the testcase uses already a string. Maybe just rebase and hope for the best: git pull --rebase upstream master

@Abhi-H
Copy link
Contributor Author

Abhi-H commented Nov 29, 2020

Thanks for checking it out, I tried and didn't get the error either when running locally either. Have rebased and pushed a second time.

@jreback jreback added this to the 1.2 milestone Dec 2, 2020
@jreback jreback merged commit c4c1dc3 into pandas-dev:master Dec 2, 2020
@jreback
Copy link
Contributor

jreback commented Dec 2, 2020

thanks @Abhi-H very nice!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Typing type annotations, mypy/pyright type checking
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Should FilePathorBuffer use os.PathLike instead of pathlib.Path?
4 participants