Skip to content

Support str param type for partition_cols in to_parquet function #27984

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 12 commits into from

Conversation

unchecked9
Copy link

@unchecked9 unchecked9 commented Aug 18, 2019

Not sure where to add whatsnew entry #PandasHack2019

@unchecked9
Copy link
Author

@sbrice

partition_cols : list or string, optional, default None
Column names by which to partition the dataset.
Columns are partitioned in the order they are given.
String identifies a single column to be partitioned.

.. versionadded:: 0.24.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a .. versionchanged:: 1.0.0 explaining that passing a single string was added in 1.0?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I added it correctly. This is my first contribution so bear with me please :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can put the .. versionchanged:: ... right below the parameter explanation (so since there is already a ..versionadded:: 0.24.0, it would go just below that

@jorisvandenbossche jorisvandenbossche added the IO Parquet parquet, feather label Aug 21, 2019
partition_cols : list or string, optional, default None
Column names by which to partition the dataset.
Columns are partitioned in the order they are given.
String identifies a single column to be partitioned.

.. versionadded:: 0.24.0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can put the .. versionchanged:: ... right below the parameter explanation (so since there is already a ..versionadded:: 0.24.0, it would go just below that

@@ -29,7 +29,7 @@ Enhancements
Other enhancements
^^^^^^^^^^^^^^^^^^

-
- String support for paramater partition_cols in the :func:`pandas.to_parquet` (:issue:`27117`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- String support for paramater partition_cols in the :func:`pandas.to_parquet` (:issue:`27117`)
- String support for paramater ``partition_cols`` in the :func:`pandas.to_parquet` (:issue:`27117`)

I would also try to make it a bit clearer to say that a string for a single column name is now also accepted instead of a list

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we really want to accept a sting here (and not a list)? i think we would rather be strict about this; though i suppose this is convenient as a partition col is often a single str

Co-Authored-By: Joris Van den Bossche <[email protected]>
@jreback
Copy link
Contributor

jreback commented Oct 6, 2019

@DalynnHatch can you merge master; the code change is ok, but will comment on the tests after rebase.

@WillAyd
Copy link
Member

WillAyd commented Nov 7, 2019

Closing as stale but @DalynnHatch ping if you'd like to pick back up and continue

@WillAyd WillAyd closed this Nov 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
IO Parquet parquet, feather
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make pandas.to_parquet handles partition columns better
6 participants