Enhancement dataframe to csv use zip method #39662

Closed
27 commits
864f372
Merge pull request #1 from pandas-dev/master
CyberQin Feb 8, 2021
52b8869
enhancement to dataframe to_csv zip method,add this step to perform l…
CyberQin Feb 8, 2021
40af498
fix test test_to_csv_zip_arguments
CyberQin Feb 8, 2021
f69cb82
test_to_csv_compression_encoding_gcs
CyberQin Feb 9, 2021
3a4fb92
fix pre-commit test failed
CyberQin Feb 9, 2021
6b1314d
fix pre-commit test failed
CyberQin Feb 9, 2021
2343eb4
try fix ci errors
CyberQin Feb 9, 2021
127b255
Merge remote-tracking branch 'upstream/master' into enhancement_dataf…
CyberQin Feb 10, 2021
e238b76
Merge pull request #2 from pandas-dev/master
CyberQin Mar 10, 2021
2b5c645
add test for to_csv with filename=='xxxx.csv.zip',filename in zipfile…
CyberQin Mar 10, 2021
8bd2619
add test for to_csv with filename=='xxxx.csv.zip',filename in zipfile…
CyberQin Mar 10, 2021
a2504d0
Merge remote-tracking branch 'origin/enhancement_dataframe_to_csv_use…
CyberQin Mar 10, 2021
f9a97bd
pre-commit-fix
CyberQin Mar 10, 2021
9dea00a
pre-commit-fix2
CyberQin Mar 10, 2021
e20c663
pre-commit-fix3
CyberQin Mar 10, 2021
ebb298d
enhancement to dataframe to_csv zip method,add this step to perform l…
CyberQin Feb 8, 2021
437510c
fix test test_to_csv_zip_arguments
CyberQin Feb 8, 2021
5ce7650
test_to_csv_compression_encoding_gcs
CyberQin Feb 9, 2021
4cfd964
fix pre-commit test failed
CyberQin Feb 9, 2021
7c1bad8
fix pre-commit test failed
CyberQin Feb 9, 2021
def0289
try fix ci errors
CyberQin Feb 9, 2021
799a2d5
add test for to_csv with filename=='xxxx.csv.zip',filename in zipfile…
CyberQin Mar 10, 2021
86ad72f
pre-commit-fix
CyberQin Mar 10, 2021
0644612
pre-commit-fix2
CyberQin Mar 10, 2021
2b138b6
pre-commit-fix3
CyberQin Mar 10, 2021
958f088
pre-commit-fix5
CyberQin Mar 11, 2021
f21de23
Merge remote-tracking branch 'origin/enhancement_dataframe_to_csv_use…
CyberQin Mar 11, 2021
5 changes: 4 additions & 1 deletion pandas/io/common.py
@@ -725,7 +725,10 @@ def __init__(
mode = mode.replace("b", "")
self.archive_name = archive_name
self.multiple_write_buffer: Optional[Union[StringIO, BytesIO]] = None

if archive_name is None and isinstance(file, (os.PathLike, str)):
    archive_name = os.path.basename(file)

Member

probably better to replace os.path.basename with stringify_path (to handle PathLike objects). That might also fix the failing test_to_csv_compression_encoding_gcs test.

Author

@CyberQin CyberQin Feb 9, 2021

It's strange that this test passed on my Windows 10 laptop.

Member

sorry, you probably still need basename to avoid having the entire path within the archive
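
A minimal sketch of why the base name matters, using only the standard-library zipfile module rather than the pandas internals touched by this diff. The name handed to the archive is stored as given, directories included. The helper `stored_names` and the path `exports/2021/data.csv.zip` are hypothetical and exist only for illustration.

```python
import os
import tempfile
import zipfile
from typing import List


def stored_names(member_name: str) -> List[str]:
    """Write one CSV member under `member_name` and return the names the archive stores."""
    with tempfile.TemporaryDirectory() as tmp:
        zip_path = os.path.join(tmp, "demo.zip")
        with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
            zf.writestr(member_name, "a,b\n1,2\n")
        with zipfile.ZipFile(zip_path) as zf:
            return zf.namelist()


# Hypothetical target path, kept relative so the result does not depend on the machine.
target = "exports/2021/data.csv.zip"

# Passing the path through unchanged keeps the directories in the member name.
full_names = stored_names(target)

# Taking the base name first stores only the file name.
base_names = stored_names(os.path.basename(target))

print(full_names)
print(base_names)
```

In this sketch the first archive holds a member nested under `exports/2021/`, while the second holds a single top-level member, which is the behavior the comment above asks to preserve.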

Author

@CyberQin CyberQin Feb 9, 2021

I reverted to os.path.basename, but now I'm getting different CI errors. Does my commit affect them?

Member

Can you add a small test case? For example, write a compressed CSV file and then open it with zipfile.ZipFile to check the name.
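
One way such a check could look, as a rough sketch and not the test that was eventually added. The helper names `single_member_name` and `check_to_csv_archive_name`, the sample frame, and the file name `data.csv.zip` are assumptions made for illustration. The `compression={"method": ..., "archive_name": ...}` form is the one already used by `test_to_csv_zip_arguments` in this PR.

```python
import os
import zipfile


def single_member_name(zip_path):
    """Return the name of the only member in the archive; fail if there is not exactly one."""
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()
    assert len(names) == 1, f"expected one member, found {names}"
    return names[0]


def check_to_csv_archive_name(tmp_dir, archive_name="data.csv"):
    """Write a zip-compressed CSV with pandas and return the stored member name."""
    import pandas as pd  # imported here so the helper above stays stdlib-only

    path = os.path.join(tmp_dir, "data.csv.zip")
    df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
    # An explicit archive_name is passed so the expected member name is unambiguous.
    df.to_csv(path, compression={"method": "zip", "archive_name": archive_name})
    return single_member_name(path)
```

Called with a temporary directory, `check_to_csv_archive_name` should return the `archive_name` that was passed in. Passing `archive_name=None` would exercise the default naming that this PR changes, so the expected value there depends on the pandas version in use.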

Author

I added a test case and finished the pre-commit fixes; a new PR has been opened: #40387

if archive_name.endswith(".zip"):
    self.archive_name = archive_name[:-4]
kwargs_zip: Dict[str, Any] = {"compression": zipfile.ZIP_DEFLATED}
kwargs_zip.update(kwargs)
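
The default-name logic in this hunk can be read as a small standalone function. The sketch below is an interpretation of the diff, not the code that ships in pandas: `infer_archive_name` is a hypothetical name, and the extra `None` guard is an assumption added so the function also covers targets that are not paths (for example an in-memory buffer).

```python
import os
from typing import Optional


def infer_archive_name(file: object, archive_name: Optional[str] = None) -> Optional[str]:
    """Hypothetical helper mirroring the default member-name logic sketched in the diff."""
    # Fall back to the target's base name only when the target is a path.
    if archive_name is None and isinstance(file, (os.PathLike, str)):
        archive_name = os.path.basename(file)
    # Drop a trailing ".zip" so "data.csv.zip" is stored as "data.csv".
    if archive_name is not None and archive_name.endswith(".zip"):
        archive_name = archive_name[:-4]
    return archive_name
```

With this reading, a target such as `dir/data.csv.zip` yields the member name `data.csv`, an explicit `archive_name` without a `.zip` suffix is returned unchanged, and a buffer target with no explicit name yields `None`.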

2 changes: 1 addition & 1 deletion pandas/tests/io/formats/test_to_csv.py
@@ -546,7 +546,7 @@ def test_to_csv_zip_arguments(self, compression, archive_name):
     path, compression={"method": compression, "archive_name": archive_name}
 )
 with ZipFile(path) as zp:
-    expected_arcname = path if archive_name is None else archive_name
+    expected_arcname = path[:-4] if archive_name is None else archive_name
     expected_arcname = os.path.basename(expected_arcname)
     assert len(zp.filelist) == 1
     archived_file = os.path.basename(zp.filelist[0].filename)