-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: read_csv
with chained fsspec TAR file and compression="infer"
#60100
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
tests needs |
Use the decorator |
CI failures seems unrelated cc: @rhshadrach |
pandas/tests/io/test_common.py
Outdated
pd.read_csv(f"tar://test.csv::file://{tar_file}", compression=None) | ||
pd.read_csv(f"tar://test.csv::file://{tar_file}", compression="infer") | ||
except Exception as e: | ||
pytest.fail(e) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the benefit of using pytest.fail as opposed to just removing the try-except entirely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was initially thinking that I need to use somekind of pytest function to validate. But it seems like it wouldn't be the case for this
pandas/tests/io/test_common.py
Outdated
pd.read_csv(f"tar://test.csv::file://{tar_file}", compression=None) | ||
pd.read_csv(f"tar://test.csv::file://{tar_file}", compression="infer") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you check the results are as expected.
pandas/tests/io/test_common.py
Outdated
@@ -642,6 +643,16 @@ def close(self): | |||
handles.created_handles.append(TestError()) | |||
|
|||
|
|||
@td.skip_if_no("fsspec") | |||
def test_read_csv_chained_url_no_error(): | |||
tar_file = "pandas/tests/io/data/tar/test-csv.tar" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a comment with the issue number as the first line
Improved the test function to validate the dataframe inputs as well
|
pandas/tests/io/test_common.py
Outdated
x_to_json_expected_output = '{"1;2":{"0":"3;4"}}' | ||
y_to_json_expected_output = '{"1;2":{"0":"3;4"}}' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you just create the expected DataFrame and use tm.assert_frame_equal
. Conversion JSON is lossy.
CI looks unrelated |
The test that's failing is the one being added here. |
Any ideas to fix it? Since the CI was |
Looks like a bug with the version of fsspec being tested ( |
pandas/tests/io/test_common.py
Outdated
chained_file_url = f"tar://test.csv::file://{tar_file_path}" | ||
|
||
result_a = pd.read_csv(chained_file_url, compression=None, sep=";") | ||
result_b = pd.read_csv(chained_file_url, compression="infer", sep=";") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you use pytest.mark.parametrize
over the compression
parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
read_csv
with chained fsspec TAR file and compression="infer"
fails with tarfile.ReadError
read_csv
with chained fsspec TAR file and compression="infer"
Thanks @KevsterAmp |
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.