-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
BUG: interchange categorical_column_to_series() should not accept only PandasColumn #52763
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
3d87f59
to
a0d3b44
Compare
@@ -74,6 +74,22 @@ def test_categorical_dtype(data): | |||
tm.assert_frame_equal(df, from_dataframe(df.__dataframe__())) | |||
|
|||
|
|||
@td.skip_if_no("pyarrow") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you use pytest.importorskip
instead?
Thank you @MarcoGorelli! LGTM +1 |
Co-authored-by: Matthew Barber <[email protected]>
Just a merge conflict |
Thanks @MarcoGorelli |
…series() should not accept only PandasColumn
…mn_to_series() should not accept only PandasColumn) (#52793) Backport PR #52763: BUG: interchange categorical_column_to_series() should not accept only PandasColumn Co-authored-by: Marco Edward Gorelli <[email protected]>
# for mypy/pyright | ||
assert isinstance(cat_column, PandasColumn), "categories must be a PandasColumn" | ||
categories = np.array(cat_column._col) | ||
if hasattr(cat_column, "_col"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason for using this attribute? (that also seems like relying on and exposing some internal detail)
Haven't looked in detail, but my understanding is that this object is just a protocol Column object, and we already have all the code to convert that to a pandas object?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know, this was already there - looks like it was introduced in https://github.com/pandas-dev/pandas/pull/47886/files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I know it was already there, but IMO the "proper" fix for the reported issue is to stop relying on this detail.
Because the the test you added only "happens" to work because pyarrow's Column object also uses _col
to store the original array; if we would rename that internal property (and I would say it's actually a confusing name in hindsight, since it's an array ;)), the test would start failing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, agreed
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.cc @AlenkaF
I've added this to the 2.0.1 whatsnew, and it's a trivial 1-line change (which was only there to begin with to help type checkers, it doesn't have any runtime purpose) and it unlocks a fair bit