Commit ccec95a

Update base IO documentation
1 parent f378746 commit ccec95a

1 file changed: +21 -2 lines changed
doc/source/user_guide/io.rst

@@ -285,14 +285,18 @@ chunksize : int, default ``None``
 Quoting, compression, and file format
 +++++++++++++++++++++++++++++++++++++
 
-compression : {``'infer'``, ``'gzip'``, ``'bz2'``, ``'zip'``, ``'xz'``, ``None``}, default ``'infer'``
+compression : {``'infer'``, ``'gzip'``, ``'bz2'``, ``'zip'``, ``'xz'``, ``None``, ``dict``}, default ``'infer'``
   For on-the-fly decompression of on-disk data. If 'infer', then use gzip,
   bz2, zip, or xz if filepath_or_buffer is a string ending in '.gz', '.bz2',
   '.zip', or '.xz', respectively, and no decompression otherwise. If using 'zip',
   the ZIP file must contain only one data file to be read in.
-  Set to ``None`` for no decompression.
+  Set to ``None`` for no decompression. Can also be a dict with key ``'method'``
+  set to one of {``'zip'``, ``'gzip'``, ``'bz2'``}, and other keys set to
+  compression settings. As an example, the following could be passed for
+  faster compression: ``compression={'method': 'gzip', 'compresslevel': 1}``.
 
   .. versionchanged:: 0.24.0 'infer' option added and set to default.
+  .. versionchanged:: 1.1.0 dict option extended to support ``gzip`` and ``bz2``.
 thousands : str, default ``None``
   Thousands separator.
 decimal : str, default ``'.'``
@@ -3347,6 +3351,12 @@ The compression type can be an explicit parameter or be inferred from the file e
 If 'infer', then use ``gzip``, ``bz2``, ``zip``, or ``xz`` if filename ends in ``'.gz'``, ``'.bz2'``, ``'.zip'``, or
 ``'.xz'``, respectively.
 
+The compression parameter can also be a ``dict`` in order to pass options to the
+compression protocol. It must have a ``'method'`` key set to the name
+of the compression protocol, which must be one of
+{``'zip'``, ``'gzip'``, ``'bz2'``}. All other key-value pairs are passed to
+the underlying compression library.
+
 .. ipython:: python
 
    df = pd.DataFrame({
@@ -3383,6 +3393,15 @@ The default is to 'infer':
    rt = pd.read_pickle("s1.pkl.bz2")
    rt
 
+Passing options to the compression protocol in order to speed up compression:
+
+.. ipython:: python
+
+   df.to_pickle(
+       "data.pkl.gz",
+       compression={"method": "gzip", 'compresslevel': 1}
+   )
+
 .. ipython:: python
    :suppress:
 
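
The documentation added in this commit says that the ``'method'`` key selects the compression protocol and that all other key-value pairs are passed to the underlying compression library. The following is a minimal sketch of that idea using only the Python standard library. It is not pandas' implementation: ``open_compressed``, ``roundtrip``, and the ``_OPENERS`` table are hypothetical names introduced for illustration, and ``'zip'`` is left out for brevity.

```python
import bz2
import gzip
import os
import tempfile

# Hypothetical lookup table from 'method' names to stdlib openers.
# This is an assumption for illustration, not pandas internals.
_OPENERS = {"gzip": gzip.open, "bz2": bz2.open}


def open_compressed(path, mode, compression):
    """Open ``path`` using a pandas-style compression dict (illustrative helper)."""
    options = dict(compression)      # copy so the caller's dict is not mutated
    method = options.pop("method")   # required key naming the protocol
    try:
        opener = _OPENERS[method]
    except KeyError:
        raise ValueError(f"unsupported compression method: {method!r}") from None
    # Every remaining key (for example compresslevel) is forwarded unchanged.
    return opener(path, mode, **options)


def roundtrip(payload, compression):
    """Write ``payload`` with the given settings, then read it back.

    Returns the decompressed bytes and the size of the compressed file.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.bin")
        with open_compressed(path, "wb", compression) as fh:
            fh.write(payload)
        size = os.path.getsize(path)
        # Reading needs only the method; write-time settings are not required.
        with open_compressed(path, "rb", {"method": compression["method"]}) as fh:
            return fh.read(), size


payload = b"pandas,io,compression\n" * 10_000
fast, fast_size = roundtrip(payload, {"method": "gzip", "compresslevel": 1})
small, small_size = roundtrip(payload, {"method": "gzip", "compresslevel": 9})
print(fast == payload, small == payload, fast_size, small_size)
```

Lower ``compresslevel`` values generally trade a larger file for less CPU time, which is the trade-off the ``to_pickle`` example in the diff relies on. The sizes printed here depend on the data and the zlib build.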