ENH: Infer inner file name of zip archive (GH39465) #44445

Merged
merged 4 commits into from
Nov 17, 2021

Conversation

marcelgerber
Contributor

@marcelgerber marcelgerber commented Nov 14, 2021

Relevant for `DataFrame.to_csv` and `Series.to_csv` with `compression='zip'`.

This fix is similar in spirit to #40387, which has been abandoned.

Before / After

import pandas as pd

df = pd.DataFrame()
df.to_csv('../test.csv.zip')

Before

> unzip -l test.csv.zip

Archive:  test.csv.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        3  2021-11-14 13:39   ../test.csv.zip
---------                     -------
        3                     1 file

Notice the `..` in the archived path, which is bad. The file inside the archive is also named test.csv.zip instead of test.csv.

After

> unzip -l test.csv.zip

Archive:  test.csv.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
        3  2021-11-14 13:44   test.csv
---------                     -------
        3                     1 file
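
The Before / After behavior above can be illustrated with a minimal, standard-library-only sketch. It is not necessarily the code merged in this PR: `infer_inner_name` and `write_zipped_text` are hypothetical helpers, and the sketch assumes the intended rule is "keep only the last path component and drop a trailing `.zip`".

```python
import tempfile
import zipfile
from pathlib import Path


def infer_inner_name(archive_path):
    """Hypothetical helper: derive the member name for a zip archive path."""
    path = Path(archive_path)
    # Drop a trailing ".zip" so "test.csv.zip" becomes "test.csv".
    if path.suffix == ".zip":
        path = path.with_suffix("")
    # Keep only the final component so "../" never ends up inside the archive.
    return path.name


def write_zipped_text(archive_path, text):
    """Hypothetical helper: store text in a zip under the inferred member name."""
    with zipfile.ZipFile(archive_path, "w") as zf:
        zf.writestr(infer_inner_name(archive_path), text)


# Round trip in a temporary directory, then list the archive members
# (the Python equivalent of `unzip -l`).
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "test.csv.zip"
    write_zipped_text(target, '""\n')
    with zipfile.ZipFile(target) as zf:
        names = zf.namelist()

print(infer_inner_name("../test.csv.zip"))
print(names)
```

Taking only the final path component is what keeps relative segments such as `..` out of the archive listing; stripping the `.zip` suffix is what makes the inner file unpack as a plain CSV.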

@marcelgerber
Contributor Author

cc @datapythonista because you reviewed that last PR

Contributor

@jreback jreback left a comment

This would also need a whatsnew note, in the I/O section of 1.4.

@jreback jreback added the IO Data IO issues that don't fit into a more specific label label Nov 14, 2021
@jreback
Contributor

jreback commented Nov 14, 2021

cc @twoertwein

@marcelgerber
Contributor Author

Thank you for the quick review @jreback 🙌

@marcelgerber
Contributor Author

Is there anything still holding this back from being merged?

@jreback jreback added this to the 1.4 milestone Nov 17, 2021
@jreback jreback merged commit b3f33a1 into pandas-dev:master Nov 17, 2021
@jreback
Contributor

jreback commented Nov 17, 2021

thanks @marcelgerber very nice!

@marcelgerber marcelgerber deleted the fix-zip-inner-file-name branch November 17, 2021 06:48
Labels
IO Data IO issues that don't fit into a more specific label
Development

Successfully merging this pull request may close these issues.

BUG: DataFrame to_csv compression with 'zip' use zipfilename as archive name
3 participants