REF: Mock all S3 Tests #20409
Key="large-file.csv",
Body=buf)

with caplog.at_level(logging.DEBUG, logger='s3fs.core'):
@martindurant does this seem like a reasonable way to test that read_csv(key, nrows=5) only triggers S3FS reading part of the object? Do you know of a better way that's perhaps less reliant on the internals of S3FS?
I'm afraid I don't have a better method for you. s3fs doesn't keep a log of transactions in any data structure you could access, and the S3File used for the download will have been cleaned up as soon as read_csv is done with it.
Hmm OK. Do you know if moto keeps a record anywhere?
I dislike this test, since s3fs adding an additional logger.debug anywhere, or changing the log message, default bytes size, etc. will break it.
boto also has a callback mechanism on download_file, but I don't see that option for get_object. If I can't figure out a way to get that working, I'll try to make the test using the logger a bit less fragile.
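As a rough illustration of the callback idea: boto3's download_file accepts a Callback argument that is invoked with the number of bytes transferred in each chunk. Below is a minimal sketch of a counter that could be passed there. The ByteCounter name is hypothetical, and the calls at the bottom simulate the callback by hand rather than talking to S3. As noted above, this does not help with get_object.

```python
import threading


class ByteCounter:
    """Hypothetical helper: accumulate byte counts reported to a transfer callback."""

    def __init__(self):
        self._lock = threading.Lock()  # callbacks may arrive from worker threads
        self.total = 0
        self.calls = 0

    def __call__(self, bytes_amount):
        with self._lock:
            self.total += bytes_amount
            self.calls += 1


counter = ByteCounter()

# With a real client this would be passed as, e.g.:
#   client.download_file(bucket, key, filename, Callback=counter)
# Here the callback is simulated by hand with two made-up chunk sizes.
for chunk_size in (262_144, 100):
    counter(chunk_size)
```

A test could then assert that counter.total stays well below the size of the uploaded object.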
Certainly you could look through all log messages captured, not just the last one. Note that you do have access to the exact s3filesystem in S3FileSystem._singleton[0], but I don't see that that helps you in this case.
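A minimal sketch of scanning every captured record rather than only the last one, using just the standard library. The logger name, the fake_read helper, and the "Fetch: ..." message format are all assumptions made for illustration; they are not the actual s3fs log output. In a pytest test, caplog.records would play the role of handler.records.

```python
import logging
import re

# Hypothetical stand-in for the library logger; the message format used
# below is an assumption, not the real s3fs log output.
logger = logging.getLogger("fake_s3.core")
logger.setLevel(logging.DEBUG)
logger.propagate = False


class ListHandler(logging.Handler):
    """Collect every record emitted while attached (similar in spirit to caplog)."""

    def __init__(self):
        super().__init__(level=logging.DEBUG)
        self.records = []

    def emit(self, record):
        self.records.append(record)


def fake_read(start, end):
    """Pretend to fetch a byte range and log it, as a client library might."""
    logger.debug("Fetch: bucket/large-file.csv, %s-%s", start, end)
    logger.debug("unrelated message added in a later release")


def fetched_ranges(records):
    """Return (start, end) for every record that looks like a range fetch."""
    pattern = re.compile(r"Fetch: .*, (\d+)-(\d+)$")
    ranges = []
    for record in records:
        match = pattern.search(record.getMessage())
        if match:
            ranges.append((int(match.group(1)), int(match.group(2))))
    return ranges


handler = ListHandler()
logger.addHandler(handler)
try:
    fake_read(0, 5_242_880)
finally:
    logger.removeHandler(handler)

ranges = fetched_ranges(handler.records)
total_bytes = sum(end - start for start, end in ranges)
```

Because unmatched records are skipped, an extra debug message elsewhere no longer breaks the check, although a change to the message format itself still would.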
You could maybe patch S3File.__exit__ to store the values of self.loc and self.end?
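A minimal sketch of that patching idea, assuming a stand-in class instead of the real S3File so that it runs without s3fs. FakeS3File, its block_size, and the exact meaning given to loc and end here are hypothetical; only the pattern of wrapping __exit__ to record attributes before cleanup is the point.

```python
from unittest import mock


class FakeS3File:
    """Hypothetical stand-in for a buffered remote file with `loc` and `end`."""

    block_size = 5 * 2**20  # assumed read-ahead size, for illustration only

    def __init__(self, size):
        self.size = size
        self.loc = 0  # current read position
        self.end = 0  # end of the byte range buffered so far (assumed meaning)

    def read(self, nbytes):
        # Buffer whole blocks until the requested position is covered.
        start = self.loc
        target = min(start + nbytes, self.size)
        while self.end < target:
            self.end = min(self.end + self.block_size, self.size)
        self.loc = target
        return b"\0" * (target - start)

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        return False


observed = []
original_exit = FakeS3File.__exit__


def recording_exit(self, *exc_info):
    # Record the positions before the original cleanup runs.
    observed.append((self.loc, self.end))
    return original_exit(self, *exc_info)


with mock.patch.object(FakeS3File, "__exit__", recording_exit):
    with FakeS3File(size=10_000_000) as f:
        f.read(100)
```

After the block, observed holds one (loc, end) pair per closed file, so a test can assert that end stayed below the object size without depending on log messages.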
Codecov Report
@@            Coverage Diff             @@
##           master   #20409      +/-   ##
==========================================
+ Coverage   91.77%   91.79%   +0.01%
==========================================
  Files         152      152
  Lines       49205    49223      +18
==========================================
+ Hits        45159    45182      +23
+ Misses       4046     4041       -5
lgtm. @TomAugspurger merge when you are ok with this.
* REF: Mock all S3 Tests Closes pandas-dev#19825
Closes #19825