DOC: update the parquet docstring #20129
pandas/core/frame.py
Outdated
@@ -1697,19 +1697,36 @@ def to_parquet(self, fname, engine='auto', compression='snappy',

.. versionadded:: 0.21.0

Requires either fastparquet or pyarrow libraries.
I think that it should go in the Notes section.
@@ -1697,19 +1697,36 @@ def to_parquet(self, fname, engine='auto', compression='snappy',
An extended summary is necessary.
pandas/core/frame.py
Outdated
Additional keyword arguments passed to the engine.

Examples
----------
There are too many hyphens.
pandas/core/frame.py
Outdated
>>> df.to_parquet('df.parquet.gzip', compression='gzip')

Returns
----------
There are too many hyphens.
pandas/core/frame.py
Outdated
@@ -1697,19 +1697,36 @@ def to_parquet(self, fname, engine='auto', compression='snappy',

.. versionadded:: 0.21.0

Requires either fastparquet or pyarrow libraries.
Third-person verbs should not be used, as stated in the docs guide. There are 2 options here: either use a subject like "This function requires" or begin with the infinitive "Require..."
I'm not 100% sure, but I have a feeling that "libraries" should be in the singular form, "library".
pandas/core/frame.py
Outdated
--------
DataFrame.to_csv : write a csv file.
DataFrame.to_sql : write to a sql table.
DataFrame.to_hdf : write to hdf.
For the "See Also" section, please check section 5 of https://python-sprints.github.io/pandas/guide/pandas_docstring.html
pandas/core/frame.py
Outdated
@@ -1697,19 +1697,42 @@ def to_parquet(self, fname, engine='auto', compression='snappy',

.. versionadded:: 0.21.0

This function writes the dataframe as a parquet file. You
can you add a link to the parquet docs (copy from io.rst) in References
Parameters
----------
fname : str
string file path
String file path.
IIRC we use lower for these? @jorisvandenbossche
Here it is fine, as it is the start of the sentence and not the exact type
pandas/core/frame.py
Outdated
Additional keyword arguments passed to the engine.

Returns
----------
make the underlines the same length as the title
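A quick way to spot this kind of mismatch is sketched below. The helper `find_bad_underlines` is hypothetical (it is not part of pandas or of `scripts/validate_docstrings.py`); it only assumes that a section underline is a run of hyphens directly below its title.

```python
import re


# Hypothetical helper, not part of pandas: report section titles whose
# hyphen underline is not the same length as the title above it.
def find_bad_underlines(docstring):
    lines = [line.strip() for line in docstring.splitlines()]
    problems = []
    for title, underline in zip(lines, lines[1:]):
        is_underline = re.fullmatch(r"-+", underline) is not None
        title_is_text = bool(title) and re.fullmatch(r"-+", title) is None
        if is_underline and title_is_text and len(underline) != len(title):
            # Record the title, its length, and the underline length.
            problems.append((title, len(title), len(underline)))
    return problems


doc = """
Returns
----------
None

See Also
--------
DataFrame.to_csv : Write a csv file.
"""
print(find_bad_underlines(doc))  # [('Returns', 7, 10)]
```

Here "Returns" has 7 characters but 10 hyphens, so it is flagged, while "See Also" (8 and 8) passes.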
pandas/core/frame.py
Outdated
DataFrame.to_sql : write to a sql table.
DataFrame.to_hdf : write to hdf.

Notes
same
pandas/core/frame.py
Outdated
See Also
--------
DataFrame.to_csv : write a csv file.
link to pd.read_parquet
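One possible shape for the section with the reader function listed first is sketched here. The function name `to_parquet_see_also_sketch` is made up for illustration, and the capitalised descriptions ending in a period are an assumption about what the docstring guide asks for, not a quote from it.

```python
def to_parquet_see_also_sketch():
    """
    Write a DataFrame to the binary parquet format.

    See Also
    --------
    read_parquet : Read a parquet file.
    DataFrame.to_csv : Write a csv file.
    DataFrame.to_sql : Write to a sql table.
    DataFrame.to_hdf : Write to hdf.
    """


# Print the docstring to check how the section reads.
print(to_parquet_see_also_sketch.__doc__)
```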
pandas/core/frame.py
Outdated
engine : {'auto', 'pyarrow', 'fastparquet'}, default 'auto'
Parquet library to use. If 'auto', then the option
``io.parquet.engine`` is used. The default ``io.parquet.engine``
behavior is to try 'pyarrow', falling back to 'fastparquet' if
'pyarrow' is unavailable.
compression : {'snappy', 'gzip', 'brotli', None}, default 'snappy'
Name of the compression to use. Use ``None`` for no compression.
kwargs
Additional keyword arguments passed to the engine
kwargs : dict
can you change this line to **kwargs
(without the dict, as you actually cannot pass it as a dict)
Writing it as **kwargs results in a validation error:
Errors found:
Errors in parameters section
Parameters {'kwargs'} not documented
Unknown parameters {'**kwargs'}
Parameter "**kwargs" has no type
You can ignore that error (the validation script is not yet perfect here :-))
Thanks for clarifying!
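For reference, a minimal sketch of how the parameter entry might look after this change. `to_parquet_sketch` is a stand-in function, its docstring is abbreviated, and `some_engine_option` is a made-up keyword used only to show that extra arguments are passed by name rather than as a single dict.

```python
def to_parquet_sketch(fname, engine='auto', compression='snappy', **kwargs):
    """
    Write a DataFrame to the binary parquet format.

    Parameters
    ----------
    fname : str
        String file path.
    **kwargs
        Additional keyword arguments passed to the parquet library.
    """
    # Inside the function the extra keywords are collected in a dict,
    # but callers supply them individually by name.
    return kwargs


# 'some_engine_option' is a hypothetical keyword, not a real engine option.
print(to_parquet_sketch('df.parquet', some_engine_option=1))
```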
pandas/core/frame.py
Outdated
Additional keyword arguments passed to the engine
kwargs : dict
Additional keyword arguments passed to the parquet library. See
the documentation for :func:`pandas.io.parquet.to_parquet` for
can you change this link to :ref:`io.parquet`
(then it will link to the user guide with more information, the link you added now is to the docstring of to_parquet)
pandas/core/frame.py
Outdated
Notes
-----
This function requires either the fastparquet or pyarrow library.
Can you make fastparquet and pyarrow into links to their home page?
The syntax is `pyarrow <https:// ...>`__
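A sketch of that link syntax applied to the Notes sentence follows. The `example.org` URLs are placeholders, not the real home pages, and `to_parquet_notes_sketch` is a hypothetical name; the actual project URLs would need to be substituted.

```python
def to_parquet_notes_sketch():
    """
    Write a DataFrame to the binary parquet format.

    Notes
    -----
    This function requires either the `fastparquet
    <https://example.org/fastparquet>`__ or `pyarrow
    <https://example.org/pyarrow>`__ library.
    """


# The URLs above are placeholders for illustration only.
print(to_parquet_notes_sketch.__doc__)
```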
Hello @benman1! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on March 12, 2018 at 11:03 Hours UTC
@benman1 I removed the Returns section, as I think it is not needed for the write functions (and we can ignore the error in the validation script)
Examples
--------
>>> df = pd.DataFrame(data={'col1': [1, 2], 'col2': [3, 4]})
>>> df.to_parquet('df.parquet.gzip', compression='gzip')
can you also show using read_parquet to read this back
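A sketch of what the extended Examples section could look like is given below; it is not necessarily the exact text that was merged. `to_parquet_examples_sketch` is a hypothetical name, and running the doctest lines themselves would need pandas (imported as `pd`) plus either pyarrow or fastparquet installed.

```python
def to_parquet_examples_sketch():
    """
    Examples
    --------
    >>> df = pd.DataFrame(data={'col1': [1, 2], 'col2': [3, 4]})
    >>> df.to_parquet('df.parquet.gzip', compression='gzip')
    >>> pd.read_parquet('df.parquet.gzip')
       col1  col2
    0     1     3
    1     2     4
    """


# Print the docstring to review the example as a reader would see it.
print(to_parquet_examples_sketch.__doc__)
```

Reading the file back in the same example shows the round trip and gives the doctest a visible result to compare against.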
Did the small edit before merging.
Checklist for the pandas documentation sprint (ignore this if you are doing
an unrelated PR):
scripts/validate_docstrings.py <your-function-or-method>
git diff upstream/master -u -- "*.py" | flake8 --diff
python doc/make.py --single <your-function-or-method>
Please include the output of the validation script below between the "```" ticks: