-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Add ZIP file decompression and TestCompression. #12175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
klass = FixedWidthFieldParser | ||
else: #default to engine == 'python': |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did you modify this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I modified this to default to the Python engine. If this is not the wanted functionality. I can remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just change code that is relevant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do. Does this need to a separate PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure why you are changing this in the first place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the use case that engine is neither 'python'
nor 'python-fwf'
.
Is it possible for this to happen?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, that would be an exception and should raise a ValueError
(do this in another issue/PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, thanks for the clarification.
@jreback The idea of the Mixin makes sense. Took me a bit to wrap my head around it. I had to add |
@jreback I think this is ready for your review. |
f = zip_file.open(file_name) | ||
else: | ||
raise ValueError('ZIP file contains multiple files {}', | ||
zip_file.filename) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You need a .format
here
Jeff is going to ask you to squash your commits into one, as per the contributing docs. |
@MaximilianR no need for squashing anymore thanks to https://github.com/pydata/pandas/blob/master/scripts/merge-py.py ;) EDIT: which reminds me, CONTRIBUTING.md needs to be updated. Will do this weekend unless someone beats me to it. |
@MaximilianR @TomAugspurger Thanks for the comments guys! |
@TomAugspurger Cool! |
pls squash even though I can do it - it's much cleaner from a future reader perspective on a smaller change |
@jreback Squashed. Thanks. |
result = self.read_csv(path, compression='zip') | ||
tm.assert_frame_equal(result, expected) | ||
|
||
result = self.read_csv(open(path, 'rb'), compression='zip') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do this in a with block to make sure the file is closed
looks pretty good. just some minor stylistic comments. |
put a test in for this error as well (e.g. try to open a non-zipfile); I don't think you need any code changes though, you can just let it raise. |
@jreback Good ideas. I think I covered all your requests. |
@lababidi ok, this lgtm. just need a whatsnew note (you can put kind of what you did in the addition to the doc-string). pls add, ping when green. |
ccfc4a7
to
032362c
Compare
|
||
file_name = 'test_file.zip' | ||
with tm.ensure_clean(file_name) as path: | ||
tmp = zipfile.ZipFile(path, mode='w') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you didn't change anything here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback what line are you referring to? zipfile.ZipFile
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, your context managers are nested, but don't need to be
with open(self.csv1, 'rb') as data_file:
data = data_file.read()
data can be used here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jreback thanks for clarifying. I was using the previous convention. I'll clean this up now.
compression : {'gzip', 'bz2', 'zip', 'infer', None}, default 'infer' | ||
For on-the-fly decompression of on-disk data. If 'infer', then use gzip, | ||
bz2 or zip if filepath_or_buffer is a string ending in '.gz', '.bz2' or | ||
'.zip', respectively, and no decompression otherwise. New in 0.18.0: ZIP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so add in a versionadded tag and put the 0.18.0 stuff there
@jreback previous tests passed. The most recent push only changed the version in the docstring |
compression : {'gzip', 'bz2', 'zip', 'infer', None}, default 'infer' | ||
For on-the-fly decompression of on-disk data. If 'infer', then use gzip, | ||
bz2 or zip if filepath_or_buffer is a string ending in '.gz', '.bz2' or | ||
'.zip', respectively, and no decompression otherwise. New in 0.18.1: ZIP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need a versionadded tag here
@lababidi ok looks pretty good. only changes are to use assertRaisesRegexp when you assert for zip in order to make sure the correct messages are raised (as there are multiple possibilites). pls make that and ping when green. |
f57a7ee
to
daae2aa
Compare
@jreback could you help me? the test only failed on the following:
|
not sure where you are seeing this. your 2.7 tests is failing because of a linting issue (line too long)
|
@jreback it's in the Travis results |
I restarted that job, though you need to repush anyhow (lint error). Never saw that one before; I think its a crash in something else, so let's see if it recurs. |
@lababidi ok this passed, except for the lint check. pls fix and repush, ping when green. |
Fix PEP8 issues. Change Compression to be a Mixin. Add Compression Mixin correctly with current Tests. Add .format, Rename Compression, with-block, empty zip, bad-zip
Thank you @jreback for your help and patience with this. I'll help out on the other issues soon. |
@lababidi no thank you! |
Closes #11413