-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Avoid calling S3File.s3 #27777
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid calling S3File.s3 #27777
Conversation
When reading from s3 using fastparquet. This attribute was removed in s3fs 0.3.0. This change avoids accessing it by using a new method get_file_and_filesystem which returns the filesystem in addition to the file.
Can you verify that we have CI coverage for s3fs 0.2.x and 0.3.x? Check in |
@TomAugspurger here are the references to s3fs there:
Should we specify versions for some of them? |
The tests failures on CI don't look like they are related to this change. I'm also confused why the codecov checks are failing. Should I add a whatsnew entry for this? In the 0.25.1 file? Thanks! |
You can ignore codecov. Whatsnew can go in 0.25.1.rst. We just ensure that we have at least one with the minimum pin. As long as we have a test with both s3fs 0.3.1 and pyarrow, so that this test is hit in CI, then I think we're good. |
Co-Authored-By: gfyoung <[email protected]>
@CJStadler are we hitting this with any tests? Given that our CI wasn't failing, I suspect we'll need a new test to ensure that we're hitting the relevant code path. |
@TomAugspurger the lines are covered so I don't think a new test is needed – CI must not have run with one of the affected versions of s3fs. On my machine It looks like s3fs just released 0.3.3 which resolves this, and that test now passes on master. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks good. Merging later today, to give other maintainers a chance to look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
generally lgtm - comments can be addressed here or as follow up
Yep. Most of these should be simple. There is a FilePathOrBuffer union type you can import from pandas._typing as well
…Sent from my iPhone
On Aug 8, 2019, at 9:24 AM, Chris Stadler ***@***.***> wrote:
@CJStadler commented on this pull request.
In pandas/io/s3.py:
> @@ -14,17 +14,15 @@ def _strip_schema(url):
return result.netloc + result.path
-def get_filepath_or_buffer(
- filepath_or_buffer, encoding=None, compression=None, mode=None
-):
+def get_file_and_filesystem(filepath_or_buffer, encoding=None, mode=None):
Cool, I didn't realize pandas was using type annotations.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub, or mute the thread.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
OK by me @TomAugspurger |
Thanks @CJStadler! |
* Avoid calling S3File.s3 When reading from s3 using fastparquet. This attribute was removed in s3fs 0.3.0. This change avoids accessing it by using a new method get_file_and_filesystem which returns the filesystem in addition to the file.
When reading from S3 using fastparquet. This attribute was removed in
s3fs 0.3.0. This change avoids accessing it by using a new method
get_file_and_filesystem
which returns the filesystem in addition to thefile.
black pandas
git diff upstream/master -u -- "*.py" | flake8 --diff