Skip to content

REF: Mock all S3 Tests #20409

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Mar 23, 2018
Merged

REF: Mock all S3 Tests #20409

merged 2 commits into from
Mar 23, 2018

Conversation

TomAugspurger
Copy link
Contributor

Closes #19825

@TomAugspurger TomAugspurger added Testing pandas testing functions or related to the test suite IO Data IO issues that don't fit into a more specific label IO Network Local or Cloud (AWS, GCS, etc.) IO Issues labels Mar 19, 2018
Key="large-file.csv",
Body=buf)

with caplog.at_level(logging.DEBUG, logger='s3fs.core'):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@martindurant does this seem like a reasonable way to test that read_csv(key, nrows=5) only triggers S3FS reading part of the object? Do you know of a better way, that's perhaps less reliant on the internals of S3FS?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid I don't have a better method for you, s3fs doesn't keep a log of transactions in any data structure you could access, and the s3file used for the download will have been cleaned up as soon as read_csv is done with it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm OK. Do you know if moto keeps a record anywhere?

I dislike this test since s3fs adding an additional logger.debug anywhere, or changing the log message, default bytes size, etc. will break it.

boto also has a callback mechanism on download_file, but I don't see that option for get_object. If I can't figure out a way to get that working, I'll try to make the test using the logger a bit less fragile.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certainly you could look through all log messages captured, not just the last one. Note that you do have access to the exact s3filesystem in S3FileSystem._singleton[0], but I don't see that that helps you in this case.

You could maybe patch S3File.__exit__ to store the values of self.loc and self.end?

@codecov
Copy link

codecov bot commented Mar 19, 2018

Codecov Report

Merging #20409 into master will increase coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #20409      +/-   ##
==========================================
+ Coverage   91.77%   91.79%   +0.01%     
==========================================
  Files         152      152              
  Lines       49205    49223      +18     
==========================================
+ Hits        45159    45182      +23     
+ Misses       4046     4041       -5
Flag Coverage Δ
#multiple 90.17% <ø> (+0.01%) ⬆️
#single 41.84% <ø> (-0.02%) ⬇️
Impacted Files Coverage Δ
pandas/util/testing.py 83.91% <0%> (-0.05%) ⬇️
pandas/core/window.py 96.26% <0%> (-0.01%) ⬇️
pandas/core/series.py 93.84% <0%> (ø) ⬆️
pandas/core/generic.py 95.85% <0%> (ø) ⬆️
pandas/core/indexes/category.py 97.3% <0%> (ø) ⬆️
pandas/core/arrays/categorical.py 96.21% <0%> (ø) ⬆️
pandas/core/panel.py 97.29% <0%> (ø) ⬆️
pandas/core/base.py 96.78% <0%> (ø) ⬆️
pandas/core/indexes/datetimelike.py 96.72% <0%> (ø) ⬆️
pandas/core/frame.py 97.18% <0%> (ø) ⬆️
... and 5 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7273ea0...69adc3d. Read the comment docs.

@jreback jreback added this to the 0.23.0 milestone Mar 19, 2018
@jreback
Copy link
Contributor

jreback commented Mar 19, 2018

lgtm. @TomAugspurger merge when you are ok with this.

@TomAugspurger TomAugspurger merged commit 0699659 into pandas-dev:master Mar 23, 2018
@TomAugspurger TomAugspurger deleted the s3 branch March 23, 2018 19:24
javadnoorb pushed a commit to javadnoorb/pandas that referenced this pull request Mar 29, 2018
dworvos pushed a commit to dworvos/pandas that referenced this pull request Apr 2, 2018
kornilova203 pushed a commit to kornilova203/pandas that referenced this pull request Apr 23, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO Data IO issues that don't fit into a more specific label IO Network Local or Cloud (AWS, GCS, etc.) IO Issues Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants