DOC: update the parquet docstring #20129

Merged: 8 commits, Mar 12, 2018
23 changes: 20 additions & 3 deletions pandas/core/frame.py
@@ -1697,19 +1697,36 @@ def to_parquet(self, fname, engine='auto', compression='snappy',

An extended summary is necessary.

.. versionadded:: 0.21.0

Requires either fastparquet or pyarrow libraries.

I think that it should go in the Notes section.

Contributor

Third-person verbs should not be used, as stated in the docs guide. There are two options here: either use a subject, as in "This function requires", or begin with the infinitive, "Require...".

I'm not 100% sure, but I have a feeling that "libraries" should be in the singular form, "library".


Parameters
----------
fname : str
string file path
String file path.

Contributor

IIRC we use lowercase for these? @jorisvandenbossche

Member

Here it is fine, since it is the start of a sentence and not the exact type.

engine : {'auto', 'pyarrow', 'fastparquet'}, default 'auto'
Parquet library to use. If 'auto', then the option
``io.parquet.engine`` is used. The default ``io.parquet.engine``
behavior is to try 'pyarrow', falling back to 'fastparquet' if
'pyarrow' is unavailable.
compression : {'snappy', 'gzip', 'brotli', None}, default 'snappy'
Name of the compression to use. Use ``None`` for no compression.
kwargs
Additional keyword arguments passed to the engine
kwargs : dict

Member

Can you change this line to `**kwargs` (without the `dict`, as you cannot actually pass it as a dict)?

Contributor Author

Writing it as **kwargs results in a validation error:

Errors found:
	Errors in parameters section
		Parameters {'kwargs'} not documented
		Unknown parameters {'**kwargs'}
		Parameter "**kwargs" has no type

Member

You can ignore that error (the validation script is not yet perfect here :-))

Contributor Author

Thanks for clarifying!
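
For context on the `**kwargs` point in this thread: extra keyword arguments are forwarded to the engine by unpacking, so a caller cannot hand them over as one positional dict. A minimal sketch, using hypothetical stand-in functions (`write_with_engine`, `to_parquet_sketch`) and a hypothetical option name (`row_group_size`) rather than the actual pandas code:

```python
def write_with_engine(path, compression='snappy', **kwargs):
    """Hypothetical stand-in for an engine's write function."""
    return {'path': path, 'compression': compression, 'extra': kwargs}


def to_parquet_sketch(fname, compression='snappy', **kwargs):
    """Hypothetical wrapper that forwards extra keyword arguments."""
    return write_with_engine(fname, compression=compression, **kwargs)


# Extra options are given as keyword arguments ...
result = to_parquet_sketch('df.parquet', row_group_size=1000)

# ... or unpacked from an existing dict with ``**``.
options = {'row_group_size': 1000}
same_result = to_parquet_sketch('df.parquet', **options)

# Passing the dict itself as a third positional argument is rejected,
# because ``**kwargs`` only collects keyword arguments.
try:
    to_parquet_sketch('df.parquet', 'snappy', options)
    positional_dict_accepted = True
except TypeError:
    positional_dict_accepted = False
```

This is why documenting the parameter as `kwargs : dict` could mislead readers into passing a dict object.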

Additional keyword arguments passed to the engine.
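
Reviewer note: the 'auto' fallback described for `engine` can be pictured roughly as below. This is a simplified illustration under stated assumptions, not pandas's actual implementation; `resolve_engine` is a hypothetical helper, and the sketch ignores the `io.parquet.engine` option entirely.

```python
import importlib.util


def resolve_engine(engine='auto', preferred=('pyarrow', 'fastparquet')):
    """Illustrative only: return the first importable Parquet library.

    Hypothetical helper; it does not consult any pandas option.
    """
    if engine != 'auto':
        if engine not in preferred:
            raise ValueError(
                "engine must be 'auto' or one of: " + ", ".join(preferred))
        return engine
    for name in preferred:
        # find_spec returns None when the top-level module is not installed.
        if importlib.util.find_spec(name) is not None:
            return name
    raise ImportError(
        "Unable to find a usable engine; tried: " + ", ".join(preferred))
```

With both libraries installed, `resolve_engine()` in this sketch would return `'pyarrow'`, because it is tried first.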

Examples
----------

There are too many hyphens.

>>> df = pd.DataFrame(data={'col1': [1, 2], 'col2': [3, 4]})
>>> df.to_parquet('df.parquet.gzip', compression='gzip')

Contributor

Can you also show using `read_parquet` to read this back?
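
A round trip along the lines requested here might look like the sketch below. It assumes pandas plus either pyarrow or fastparquet is installed, and it writes to a temporary directory instead of the working directory; `have_parquet_stack` and `roundtrip` are hypothetical helper names added only for this illustration.

```python
import importlib.util
import os
import tempfile


def have_parquet_stack():
    """True if pandas and at least one Parquet engine appear installed."""
    if importlib.util.find_spec('pandas') is None:
        return False
    return any(importlib.util.find_spec(name) is not None
               for name in ('pyarrow', 'fastparquet'))


def roundtrip():
    """Write the example frame to a temporary file and read it back."""
    import pandas as pd

    df = pd.DataFrame(data={'col1': [1, 2], 'col2': [3, 4]})
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'df.parquet.gzip')
        df.to_parquet(path, compression='gzip')
        restored = pd.read_parquet(path)
    return df, restored


if have_parquet_stack():
    original, restored = roundtrip()
    print(restored)
else:
    print('pandas plus pyarrow or fastparquet is needed for this example')
```

In the docstring itself, the two relevant lines would simply be the existing `df.to_parquet(...)` call followed by `pd.read_parquet('df.parquet.gzip')`.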


Returns
----------

There are too many hyphens.

Contributor

Make the underlines the same length as the title.
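
The underline rule can be checked mechanically. The following is a small sketch, not the project's validation script; `mismatched_underlines` is a hypothetical helper that treats any line made up only of hyphens as the underline of the line above it.

```python
def mismatched_underlines(docstring):
    """Return (title, underline) pairs whose lengths differ.

    A line made up only of hyphens is treated as the underline of the
    non-empty line directly above it.
    """
    lines = [line.strip() for line in docstring.splitlines()]
    problems = []
    for title, underline in zip(lines, lines[1:]):
        is_underline = underline != '' and set(underline) == {'-'}
        if title and is_underline and len(title) != len(underline):
            problems.append((title, underline))
    return problems


sample = """
Examples
----------
>>> df.to_parquet('df.parquet.gzip', compression='gzip')

See Also
--------
DataFrame.to_csv : write a csv file.
"""

problems = mismatched_underlines(sample)
```

Here `problems` contains only the `Examples` heading, since its underline has ten hyphens for an eight-character title, while `See Also` passes.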

Nothing.

See Also
--------
DataFrame.to_csv : write a csv file.

Contributor

Link to `pd.read_parquet`.

DataFrame.to_sql : write to a sql table.
DataFrame.to_hdf : write to hdf.

Contributor

For the "See Also" section, please check section 5 of https://python-sprints.github.io/pandas/guide/pandas_docstring.html

"""
from pandas.io.parquet import to_parquet
to_parquet(self, fname, engine,