Allow reading SAS files from archives #47154
Changes from 6 commits
1b720e9
6448ed2
b260806
043dd96
67f3632
52d79f4
2306289
e917a3b
11b9bf6
@@ -20,7 +20,15 @@ def test_sas_buffer_format(self):
 
     def test_sas_read_no_format_or_extension(self):
         # see gh-24548
-        msg = "unable to infer format of SAS file"
+        msg = "unable to infer format of SAS file.+"
         with tm.ensure_clean("test_file_no_extension") as path:
             with pytest.raises(ValueError, match=msg):
                 read_sas(path)
+
+
+def test_sas_archive(datapath):
+    fname_uncompressed = datapath("io", "sas", "data", "airline.sas7bdat")
+    df_uncompressed = read_sas(fname_uncompressed)
+    fname_compressed = datapath("io", "sas", "data", "airline.sas7bdat.gz")

Is it worth adding a gzip'd file? Could the gzip file be created on the fly during the test? This would also let you test other supported formats, e.g., zip.

We do have

I don't think it is essential to use this method - I just wonder about the longer term future since it isn't strictly necessary to add compressed versions of files to the main repo. I'm happy to sign off on it as it is now.

My 2c. I would prefer these files to be gzipped on the fly in the test, so that only files that are absolutely necessary live in the main repo. It doesn't have to be in this PR, but a follow-up would be appreciated.

+    df_compressed = read_sas(fname_compressed, format="sas7bdat")
+    tm.assert_frame_equal(df_uncompressed, df_compressed)
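
The suggestion in the review comments, compressing the fixture on the fly instead of committing `airline.sas7bdat.gz`, could look roughly like the sketch below. It uses only the standard library; the helper names `gzip_copy` and `zip_copy` are hypothetical and are not part of pandas or of this PR.

```python
import gzip
import shutil
import zipfile
from pathlib import Path


def gzip_copy(src: Path, dest_dir: Path) -> Path:
    """Write a gzip-compressed copy of `src` into `dest_dir` (hypothetical helper)."""
    dest = dest_dir / (src.name + ".gz")
    with open(src, "rb") as f_in, gzip.open(dest, "wb") as f_out:
        # Stream the bytes so large fixtures are not loaded into memory at once.
        shutil.copyfileobj(f_in, f_out)
    return dest


def zip_copy(src: Path, dest_dir: Path) -> Path:
    """Write a single-member zip archive of `src` into `dest_dir` (hypothetical helper)."""
    dest = dest_dir / (src.name + ".zip")
    with zipfile.ZipFile(dest, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        # Store the member under its bare file name, without directories.
        zf.write(src, arcname=src.name)
    return dest
```

In a test, the compressed path could then be produced inside a temporary directory (for example pytest's `tmp_path` fixture) and passed to `read_sas(..., format="sas7bdat")`, assuming the archive support added by this PR. Parametrizing over the two helpers would cover both gzip and zip without adding binary files to the repo.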
No indent here for the `%(...)`.

See question in other comment thread. I think your suggestion is wrong.