ENH Enable bzip2 streaming for Python 3 #11072
tests!
9c95929 to 5e2b1b7
I added a test for reading from an open file with the C parser. It fails on the master branch and passes here. How's that?
do you have exactly the same deps
Yes, exactly the same dependencies. This PR works because the standard library
ok, this looks good. pls add a note in whatsnew for 0.17.0 (just released the rc1 yesterday, but this is ok). reference both the original issue and this PR number I think. squash & ping when green.
5e2b1b7 to 7fb36b7
Note added. It doesn't look like anything else references a PR; should I leave that reference in?
@@ -465,6 +465,8 @@ Other enhancements

- Improved error message when concatenating an empty iterable of dataframes (:issue:`9157`)

- ``pd.read_csv`` can now read bz2-compressed files incrementally, and the C parser can read bz2-compressed files from AWS S3 (:issue:`110701`, :pr:`11072`).
just reference it like an issue :issue:`11072`, we don't distinguish
Python 2's bz2 module can open a bz2 file only by filename, but Python 3's can also wrap an already-open file object. This lets Python 3 decompress bz2 data one piece at a time as it is read.
7fb36b7 to 636afbe
@jreback, tests are green!
ENH Enable bzip2 streaming for Python 3
thanks!
This is the one modification related to issue #11070 which affects non-S3 interactions with `read_csv`. The Python 3 standard library has an improved capability for handling bz2 compression, so a simple change will let `read_csv` stream bz2-compressed files.
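For illustration, here is a minimal standard-library sketch of the Python 3 capability this relies on: `bz2.BZ2File` can wrap an already-open binary file object and be iterated line by line. This is not the code changed in the PR; the helper name `iter_bz2_lines` and the sample data are hypothetical, and an in-memory `io.BytesIO` stands in for a real open file or network stream.

```python
import bz2
import io


def iter_bz2_lines(fileobj):
    """Yield decoded text lines from an open binary file object holding bz2 data.

    Hypothetical helper for illustration only; not part of pandas.
    """
    # Python 3's BZ2File accepts an existing file object, not just a filename,
    # so the data is decompressed incrementally as it is read.
    with bz2.BZ2File(fileobj, mode="rb") as decompressed:
        for raw_line in decompressed:
            yield raw_line.decode("utf-8")


# Made-up sample data: a tiny CSV compressed in memory.
csv_text = "a,b\n1,2\n3,4\n"
compressed = io.BytesIO(bz2.compress(csv_text.encode("utf-8")))

lines = list(iter_bz2_lines(compressed))
print(lines)
```

Because the wrapper only needs an object with a `read` method, the same approach should apply to any binary stream, which is presumably what makes streaming from a non-local source possible.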