BUG: Remove unnecessary validation to non-string columns/index in df.to_parquet #52036
pandas/io/parquet.py
"string",
"empty",
}:
# GH 52034: RangeIndex.inferred_dtype is always "integer" if empty
This might be too broad? An empty Index with int dtype should probably still raise?
E.g. Index([1], dtype="int64") should behave the same as Index([], dtype="int64")?
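To make the concern concrete, here is a minimal sketch that compares `Index.inferred_type` for the cases under discussion. It assumes a recent pandas release; `inferred_types` is a hypothetical helper written for this illustration and is not part of pandas. As far as I can tell, a typed empty integer index reports the same inferred type as a non-empty one, while only an empty object-dtype index reports "empty", so running this shows which indexes the "empty" entry in the check would actually let through.

```python
import pandas as pd


# Hypothetical helper for illustration only; not part of pandas.
def inferred_types(indexes):
    """Map each label to the inferred_type of its Index."""
    return {label: idx.inferred_type for label, idx in indexes.items()}


types = inferred_types(
    {
        "non-empty int64": pd.Index([1], dtype="int64"),
        "empty int64": pd.Index([], dtype="int64"),
        "empty object": pd.Index([], dtype=object),
        "empty RangeIndex": pd.RangeIndex(0),
    }
)

# Print the inferred type next to each case for comparison.
for label, kind in types.items():
    print(f"{label}: {kind}")
```

If the two int64 cases print the same value, the typed empty index already behaves like the non-empty one under an `inferred_type` check, and the empty RangeIndex from GH 52034 is the case that needs separate handling.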
Hmm, actually, do you know why we have this limitation to string column names? pyarrow doesn't seem to have it.
In [25]: tb = pa.Table.from_pandas(pd.DataFrame({1: [2]}))
In [26]: tb
Out[26]:
pyarrow.Table
1: int64
----
1: [[2]]
In [27]: pq.write_table(tb, "abc")
In [28]: pq.read_table("abc")
Out[28]:
pyarrow.Table
1: int64
----
1: [[2]]
This could be an old artefact (same as for read_orc, which we removed a couple of days ago), so I'd be OK with getting rid of this if it's not necessary.
thx @mroeschke
Owee, I'm MrMeeseeks, Look at me. There seem to be a conflict, please backport manually. Here are approximate instructions:
And apply the correct labels and milestones. Congratulations, you did some good work! Hopefully your backport PR will be tested by the continuous integration and merged soon! Remember to remove the If these instructions are inaccurate, feel free to suggest an improvement.
…n to non-string columns/index in df.to_parquet) (#52044) BUG: Remove unnecessary validation to non-string columns/index in df.to_parquet (#52036) Co-authored-by: Matthew Roeschke <[email protected]>