Skip to content

Avoid calling S3File.s3 #27777

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Aug 12, 2019
Merged

Conversation

CJStadler
Copy link
Contributor

@CJStadler CJStadler commented Aug 6, 2019

When reading from S3 using fastparquet. This attribute was removed in
s3fs 0.3.0. This change avoids accessing it by using a new method
get_file_and_filesystem which returns the filesystem in addition to the
file.

When reading from s3 using fastparquet. This attribute was removed in
s3fs 0.3.0. This change avoids accessing it by using a new method
get_file_and_filesystem which returns the filesystem in addition to the
file.
@TomAugspurger
Copy link
Contributor

Can you verify that we have CI coverage for s3fs 0.2.x and 0.3.x? Check in ci/deps/environment*.yml.

@TomAugspurger TomAugspurger added this to the 0.25.1 milestone Aug 6, 2019
@TomAugspurger TomAugspurger added IO Data IO issues that don't fit into a more specific label IO Parquet parquet, feather labels Aug 6, 2019
@CJStadler
Copy link
Contributor Author

@TomAugspurger here are the references to s3fs there:

ci/deps/travis-37.yaml
20:  - s3fs

ci/deps/azure-windows-37.yaml
21:  - s3fs

ci/deps/travis-36-slow.yaml
21:  - s3fs

ci/deps/travis-36-locale.yaml
29:  - s3fs=0.0.8

ci/deps/azure-37-locale.yaml
22:  - s3fs

ci/deps/azure-36-locale_slow.yaml
22:  - s3fs

ci/deps/travis-36-cov.yaml
32:  - s3fs

Should we specify versions for some of them?

@CJStadler
Copy link
Contributor Author

The tests failures on CI don't look like they are related to this change. I'm also confused why the codecov checks are failing.

Should I add a whatsnew entry for this? In the 0.25.1 file?

Thanks!

@TomAugspurger
Copy link
Contributor

You can ignore codecov.

Whatsnew can go in 0.25.1.rst.

We just ensure that we have at least one with the minimum pin.

As long as we have a test with both s3fs 0.3.1 and pyarrow, so that this test is hit in CI, then I think we're good.

@TomAugspurger
Copy link
Contributor

@CJStadler are we hitting this with any tests? Given that our CI wasn't failing, I suspect we'll need a new test to ensure that we're hitting the relevant code path.

@CJStadler
Copy link
Contributor Author

@TomAugspurger the lines are covered so I don't think a new test is needed – CI must not have run with one of the affected versions of s3fs. On my machine TestParquetFastParquet.test_s3_roundtrip fails on master for s3fs 0.3.0-0.3.2 and passes on my branch.

It looks like s3fs just released 0.3.3 which resolves this, and that test now passes on master.

Copy link
Contributor

@TomAugspurger TomAugspurger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good. Merging later today, to give other maintainers a chance to look.

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally lgtm - comments can be addressed here or as follow up

@WillAyd
Copy link
Member

WillAyd commented Aug 8, 2019 via email

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

@WillAyd
Copy link
Member

WillAyd commented Aug 9, 2019

OK by me @TomAugspurger

@TomAugspurger TomAugspurger merged commit 35821a5 into pandas-dev:master Aug 12, 2019
@TomAugspurger
Copy link
Contributor

Thanks @CJStadler!

meeseeksmachine pushed a commit to meeseeksmachine/pandas that referenced this pull request Aug 12, 2019
quintusdias pushed a commit to quintusdias/pandas_dev that referenced this pull request Aug 16, 2019
* Avoid calling S3File.s3

When reading from s3 using fastparquet. This attribute was removed in
s3fs 0.3.0. This change avoids accessing it by using a new method
get_file_and_filesystem which returns the filesystem in addition to the
file.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO Data IO issues that don't fit into a more specific label IO Parquet parquet, feather
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error reading parquet from s3 with s3fs >= 0.3.0
4 participants