Skip to content

DOC: update the parquet docstring #20129

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Mar 12, 2018
30 changes: 27 additions & 3 deletions pandas/core/frame.py
Original file line number Diff line number Diff line change
Expand Up @@ -1697,19 +1697,43 @@ def to_parquet(self, fname, engine='auto', compression='snappy',

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An extended summary is necessary.

.. versionadded:: 0.21.0

This function writes the dataframe as a `parquet file
<https://parquet.apache.org/>`_. You can choose different parquet
backends, and have the option of compression. See
:ref:`the user guide <io.parquet>` for more details.

Parameters
----------
fname : str
string file path
String file path.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC we use lower for these? @jorisvandenbossche

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here it is fine as start of the sentence and not being the exact type

engine : {'auto', 'pyarrow', 'fastparquet'}, default 'auto'
Parquet library to use. If 'auto', then the option
``io.parquet.engine`` is used. The default ``io.parquet.engine``
behavior is to try 'pyarrow', falling back to 'fastparquet' if
'pyarrow' is unavailable.
compression : {'snappy', 'gzip', 'brotli', None}, default 'snappy'
Name of the compression to use. Use ``None`` for no compression.
kwargs
Additional keyword arguments passed to the engine
**kwargs
Additional arguments passed to the parquet library. See
:ref:`pandas io <io.parquet>` for more details.

See Also
--------
read_parquet : Read a parquet file.
DataFrame.to_csv : Write a csv file.
DataFrame.to_sql : Write to a sql table.
DataFrame.to_hdf : Write to hdf.

Notes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

-----
This function requires either the `fastparquet
<https://pypi.python.org/pypi/fastparquet>`_ or `pyarrow
<https://arrow.apache.org/docs/python/>`_ library.

Examples
--------
>>> df = pd.DataFrame(data={'col1': [1, 2], 'col2': [3, 4]})
>>> df.to_parquet('df.parquet.gzip', compression='gzip')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also show using read_parquet to read this back

"""
from pandas.io.parquet import to_parquet
to_parquet(self, fname, engine,
Expand Down