ENH: DataFrame.to_csv support for "compression='gzip'" #7615
Comments
Would accept a pull-request to limit this. |
I think this is out of scope for pandas. Just use this: https://docs.python.org/2/library/gzip.html Please close. |
But compression='gzip' is accepted (and enacted) in pd.read_csv, which is why I was assuming to_csv behaves the same. |
The way you initially phrased the issue suggested that you were just guessing at keyword arguments--'compression' isn't a documented argument so I don't think your confusion is shared by many. You're welcome to submit a pull-request, I don't feel religious about this at all |
Sorry, what I meant is:
There is no error in the documentation and both (1) and (2) make sense to me. Closing on the grounds that I won't be fixing it myself, and probably it's not a proper bug. |
Thanks! I definitely didn't mean to antagonize you--agreed that it's an unfortunate inconsistency |
Would we want this feature if someone implemented it? If so, could we leave it open, marked as an enhancement proposal? |
I would also like to_csv to have the same functionality as from_csv. |
+1 for a compression argument. In Python 3.4, I use the following workaround:
with gzip.open('path_to_file', 'wt') as write_file:
    data_frame.to_csv(write_file) |
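For anyone who wants to try the gzip.open workaround end to end, here is a minimal sketch of the same idea. To keep it runnable without pandas, the DataFrame is replaced by a few hypothetical rows written with the standard-library csv module; with pandas, the data_frame.to_csv(write_file) call from the comment above would take the place of the csv.writer line. The path and row values are assumptions for illustration only.

```python
import csv
import gzip
import os
import tempfile

# Hypothetical stand-in data; with pandas this would be a DataFrame.
rows = [["", "a", "b"], [0, 0, 1], [1, 2, 3]]

# Write into a throwaway directory so the sketch leaves no files behind in cwd.
path = os.path.join(tempfile.mkdtemp(), "test.csv.gz")

# 'wt' opens the gzip stream in text mode, so the CSV text is compressed
# as it is written. With pandas: data_frame.to_csv(write_file)
with gzip.open(path, "wt", newline="") as write_file:
    csv.writer(write_file).writerows(rows)

# Read the file back through gzip to confirm the round trip.
with gzip.open(path, "rt", newline="") as read_file:
    round_trip = list(csv.reader(read_file))

# Raw bytes on disk, without decompression.
with open(path, "rb") as raw_file:
    raw_header = raw_file.read(2)
```

The csv module returns every field as a string, so the values in round_trip come back as '0', '1' and so on rather than integers.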
@dhimmel If you're interested in putting in the work, I think we're still open to a PR to add this feature. |
@shoyer, okay I will keep this in mind. I have a bit to learn first. |
Thank you so much for implementing this! Besides the aesthetics POV and fixing the asymmetry between read/write, this is a huge improvement to some people like me. |
This did not work for me: the output file isn't compressed. I'm using Pandas 0.18.1. |
@jsmedmar could you open a new issue with that demonstrating the problem? Thanks. |
@jsmedmar I see the "compression" argument is properly documented and it is working. One confusing thing is that if you run the following code
you get a compressed file, but opening it in vim automatically decompresses it, so to verify that compression happened, use the "head" command: $ head test.csv.gz |
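A tool-independent way to check whether a file really is gzip-compressed is to look at its first two bytes, which for gzip are always 0x1f 0x8b. The sketch below is one possible check, not something taken from pandas; the helper name looks_gzipped and the file names are hypothetical, and the sample CSV text is made up for the demonstration.

```python
import gzip
import os
import tempfile

# Every gzip stream starts with these two magic bytes.
GZIP_MAGIC = b"\x1f\x8b"


def looks_gzipped(path):
    """Return True if the file at `path` starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


workdir = tempfile.mkdtemp()
compressed_path = os.path.join(workdir, "test.csv.gz")
plain_path = os.path.join(workdir, "plain.csv.gz")  # .gz name, but not compressed

csv_text = ",a,b\n0,0,1\n1,2,3\n"

# One file written through gzip, one written as plain text.
with gzip.open(compressed_path, "wt") as f:
    f.write(csv_text)
with open(plain_path, "w") as f:
    f.write(csv_text)

print(looks_gzipped(compressed_path))
print(looks_gzipped(plain_path))
```

The check reads the raw bytes, so it is not fooled by editors or viewers that decompress transparently, and it does not rely on the file extension.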
The DataFrame.to_csv method seems to accept a "compression" named parameter:
However, the file it creates is not compressed at all:
francesco@i3 ~/Desktop $ cat test.csv.gz
,a,b
0,0,1
1,2,3
2,4,5
3,6,7
4,8,9
How about either (i) actually implementing compression, or at least (ii) raising an error? The current behavior is confusing...