DOC for refactored compression (GH14576) + BUG: bz2-compressed URL with C engine (GH14874) #14880
Changes from all commits
a7960f6
85630ea
cb91007
210fb20
0e0fa0a
f8a7900
c4ea3d3
09dcbff
8568aed
e1b5d42
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -64,6 +64,27 @@ Strings passed to ``DataFrame.groupby()`` as the ``by`` parameter may now refere | |
|
||
df.groupby(['second', 'A']).sum() | ||
|
||
.. _whatsnew_0200.enhancements.compressed_urls: | ||
|
||
Better support for compressed URLs in ``read_csv`` | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
The compression code was refactored (:issue:`12688`). As a result, reading | ||
dataframes from URLs in :func:`read_csv` or :func:`read_table` now supports | ||
additional compression methods: ``xz``, ``bz2``, and ``zip`` (:issue:`14570`). | ||
Previously, only ``gzip`` compression was supported. By default, compression for | |
both URLs and paths is now inferred from their file extensions. Additionally, | |
Additionally, support for bz2 compress in the python 2 c-engine improved.
Addressed comments 1 and 3 in e1b5d42. @jreback, I didn't change:
Previously, compression of paths was by default inferred from their extension, but not URLs. Now both are inferred by their extension. Am I missing something?
||
support for ``bz2`` compression in the Python 2 C engine was improved (:issue:`14874`). | |
|
||
.. ipython:: python | ||
   url = 'https://github.com/{repo}/raw/{branch}/{path}'.format( | |
       repo='pandas-dev/pandas', | |
       branch='master', | |
       path='pandas/io/tests/parser/data/salaries.csv.bz2', | |
   ) | |
   df = pd.read_table(url, compression='infer')  # default, infer compression | |
   df = pd.read_table(url, compression='bz2')  # explicitly specify compression | |
   df.head(2) | |
|
||
.. _whatsnew_0200.enhancements.other: | ||
|
||
|
if there are any other issues that were closed by this, pls list them as well.
Rechecked... they're all already listed.
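For anyone who wants to check the extension-based inference without network access, here is a minimal local sketch. It assumes a recent pandas on Python 3; the file name and the two-row table are made up for illustration and are not the `salaries.csv.bz2` fixture from the test suite. It writes a small bz2-compressed, tab-separated file and reads it back once with `compression='infer'` and once with `compression='bz2'`.

```python
import bz2
import os
import tempfile

import pandas as pd

# Hypothetical sample data, not taken from the pandas test suite.
csv_text = "name\tsalary\nalice\t100\nbob\t200\n"

with tempfile.TemporaryDirectory() as tmpdir:
    # The ".bz2" suffix is what compression='infer' keys on.
    path = os.path.join(tmpdir, "salaries.csv.bz2")
    with bz2.open(path, "wt") as fh:
        fh.write(csv_text)

    # Default behaviour: compression is inferred from the file extension.
    inferred = pd.read_csv(path, sep="\t", compression="infer")
    # Explicitly naming the compression method should give the same frame.
    explicit = pd.read_csv(path, sep="\t", compression="bz2")

same = inferred.equals(explicit)
print(same)
```

This only exercises a local path; the URL case in the whatsnew entry should go through the same `compression` argument, but that is not verified here.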