Commit fe09884

Make compression=infer the default, and update the docs
1 parent 48fd726 commit fe09884

3 files changed, +8 -6 lines changed

doc/source/io.rst

+2 -1
@@ -89,7 +89,8 @@ They can take a number of arguments:
   - ``delim_whitespace``: Parse whitespace-delimited (spaces or tabs) file
     (much faster than using a regular expression)
   - ``compression``: decompress ``'gzip'`` and ``'bz2'`` formats on the fly.
-    Set to ``'infer'`` to guess a format based on the file extension.
+    Set to ``'infer'`` (the default) to guess a format based on the file
+    extension.
   - ``dialect``: string or :class:`python:csv.Dialect` instance to expose more
     ways to specify the file format
   - ``dtype``: A data type name or a dict of column name to data type. If not

doc/source/whatsnew/v0.16.1.txt

+2 -1
@@ -16,7 +16,6 @@ We recommend that all users upgrade to this version.
 
 Enhancements
 ~~~~~~~~~~~~
-- Setting the ``compression`` argument of ``read_csv`` or ``read_table`` to ``'infer'`` will now guess the compression type based on the file extension. (:issue:`9770`)
 
 
 
@@ -35,6 +34,8 @@ API changes
 - Add support for separating years and quarters using dashes, for
   example 2014-Q1. (:issue:`9688`)
 
+- By default, ``read_csv`` and ``read_table`` will now try to infer the compression type based on the file extension. Set ``compression=None`` to restore the previous behavior. (:issue:`9770`)
+
 .. _whatsnew_0161.performance:
 
 Performance Improvements

pandas/io/parsers.py

+4 -4
@@ -55,10 +55,10 @@ class ParserWarning(Warning):
 dtype : Type name or dict of column -> type
     Data type for data or columns. E.g. {'a': np.float64, 'b': np.int32}
     (Unsupported with engine='python')
-compression : {'gzip', 'bz2', 'infer', None}, default None
+compression : {'gzip', 'bz2', 'infer', None}, default 'infer'
     For on-the-fly decompression of on-disk data. If 'infer', then use gzip or
     bz2 if filepath_or_buffer is a string ending in '.gz' or '.bz2',
-    respectively, and None otherwise.
+    respectively, and no decompression otherwise.
 dialect : string or csv.Dialect instance, default None
     If None defaults to Excel dialect. Ignored if sep longer than 1 char
     See csv.Dialect documentation for more details
@@ -296,7 +296,7 @@ def _read(filepath_or_buffer, kwds):
     'verbose': False,
     'encoding': None,
     'squeeze': False,
-    'compression': None,
+    'compression': 'infer',
     'mangle_dupe_cols': True,
     'tupleize_cols': False,
     'infer_datetime_format': False,
@@ -336,7 +336,7 @@ def _make_parser_function(name, sep=','):
     def parser_f(filepath_or_buffer,
                  sep=sep,
                  dialect=None,
-                 compression=None,
+                 compression='infer',
 
                  doublequote=True,
                  escapechar=None,
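
The inference rule described in the updated docstring can be illustrated with a small standalone sketch. The helper name `infer_compression` is hypothetical and is not taken from this commit or from pandas; it only mirrors the documented behavior: with `compression='infer'`, a string path ending in `.gz` maps to gzip, one ending in `.bz2` maps to bz2, and anything else gets no decompression.

```python
def infer_compression(filepath_or_buffer, compression="infer"):
    """Hypothetical helper mirroring the documented rule (not pandas' code)."""
    # An explicit setting ('gzip', 'bz2', or None) is returned unchanged.
    if compression != "infer":
        return compression
    # Inference only applies when the input is a string path.
    if isinstance(filepath_or_buffer, str):
        if filepath_or_buffer.endswith(".gz"):
            return "gzip"
        if filepath_or_buffer.endswith(".bz2"):
            return "bz2"
    # Other extensions and non-string buffers: no decompression.
    return None


if __name__ == "__main__":
    for path in ("data.csv.gz", "data.csv.bz2", "data.csv"):
        print(path, "->", infer_compression(path))
    # Passing compression=None opts out of inference, matching the
    # "restore the previous behavior" note in the whatsnew entry.
    print(infer_compression("data.csv.gz", compression=None))
```

Under this sketch, a caller who relied on the old default and reads a file whose name happens to end in `.gz` or `.bz2` without being compressed would need to pass `compression=None` explicitly.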
