@@ -285,14 +285,18 @@ chunksize : int, default ``None``
  Quoting, compression, and file format
  +++++++++++++++++++++++++++++++++++++

- compression : {``'infer'``, ``'gzip'``, ``'bz2'``, ``'zip'``, ``'xz'``, ``None``}, default ``'infer'``
+ compression : {``'infer'``, ``'gzip'``, ``'bz2'``, ``'zip'``, ``'xz'``, ``None``, ``dict``}, default ``'infer'``
    For on-the-fly decompression of on-disk data. If 'infer', then use gzip,
    bz2, zip, or xz if filepath_or_buffer is a string ending in '.gz', '.bz2',
    '.zip', or '.xz', respectively, and no decompression otherwise. If using 'zip',
    the ZIP file must contain only one data file to be read in.
-   Set to ``None`` for no decompression.
+   Set to ``None`` for no decompression. Can also be a dict with key ``'method'``
+   set to one of {``'zip'``, ``'gzip'``, ``'bz2'``}, and other keys set to
+   compression settings. As an example, the following could be passed for
+   faster compression: ``compression={'method': 'gzip', 'compresslevel': 1}``.

    .. versionchanged:: 0.24.0 'infer' option added and set to default.
+   .. versionchanged:: 1.1.0 dict option extended to support ``gzip`` and ``bz2``.
  thousands : str, default ``None``
    Thousands separator.
  decimal : str, default ``'.'``
@@ -3347,6 +3351,12 @@ The compression type can be an explicit parameter or be inferred from the file e
  If 'infer', then use ``gzip``, ``bz2``, ``zip``, or ``xz`` if filename ends in ``'.gz'``, ``'.bz2'``, ``'.zip'``, or
  ``'.xz'``, respectively.

+ The compression parameter can also be a ``dict`` in order to pass options to the
+ compression protocol. It must have a ``'method'`` key set to the name
+ of the compression protocol, which must be one of
+ {``'zip'``, ``'gzip'``, ``'bz2'``}. All other key-value pairs are passed to
+ the underlying compression library.
+
  .. ipython:: python

     df = pd.DataFrame({
@@ -3383,6 +3393,15 @@ The default is to 'infer':
     rt = pd.read_pickle("s1.pkl.bz2")
     rt

+ Passing options to the compression protocol in order to speed up compression:
+
+ .. ipython:: python
+
+    df.to_pickle(
+        "data.pkl.gz",
+        compression={"method": "gzip", 'compresslevel': 1}
+    )
+
  .. ipython:: python
     :suppress:
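The dict form documented in this diff says that the ``'method'`` key selects the protocol and that all other key-value pairs are passed to the underlying compression library. The sketch below illustrates that dispatch idea using only the Python standard library. It is not the pandas implementation: `open_compressed` and `roundtrip` are hypothetical helper names introduced here, and `'zip'` is left out because the `zipfile` module has a different interface from `gzip.open` and `bz2.open`.

```python
import bz2
import gzip
import os
import pickle
import tempfile

# Hypothetical mapping (not part of pandas): 'method' name -> stdlib opener.
_OPENERS = {"gzip": gzip.open, "bz2": bz2.open}


def open_compressed(path, mode, compression):
    """Open `path` according to a dict like {'method': 'gzip', 'compresslevel': 1}."""
    options = dict(compression)     # copy so the caller's dict is not mutated
    method = options.pop("method")  # the required 'method' key picks the library
    if method not in _OPENERS:
        raise ValueError(f"unsupported compression method: {method!r}")
    # Every remaining key-value pair is forwarded to the compression library.
    return _OPENERS[method](path, mode, **options)


def roundtrip(obj, compression):
    """Pickle `obj` to a temporary compressed file and load it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "data.pkl")
        with open_compressed(path, "wb", compression) as fh:
            pickle.dump(obj, fh)
        # Reading needs only the method; write-time options are not required.
        with open_compressed(path, "rb", {"method": compression["method"]}) as fh:
            return pickle.load(fh)
```

Calling `roundtrip({"a": [1, 2, 3]}, {"method": "gzip", "compresslevel": 1})` writes with the fastest gzip level and returns an object equal to the input. A lower `compresslevel` generally trades file size for speed, which is the use case the added documentation describes.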