fix: use fastpath for PyCapsule export when starting from pyarrow-backed Series, respect requested_schema #59683

MarcoGorelli · 2024-09-02T09:23:41Z

Follow-up to https://github.com/pandas-dev/pandas/pull/59587/files (so, I haven't added a new whatsnew)

closes #xxxx (Replace xxxx with the GitHub issue number)
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

MarcoGorelli · 2024-09-02T09:29:33Z

@jorisvandenbossche @WillAyd fancy taking a look?

jorisvandenbossche

Thanks for the follow-up!

jorisvandenbossche · 2024-09-02T18:58:24Z

pandas/core/series.py

+        if isinstance(self.dtype, ArrowDtype):
+            # fastpath!
+            ca = self.values._pa_array
+            if type is not None:
+                ca = ca.cast(type)
+        else:
+            ca = pa.chunked_array([pa.Array.from_pandas(self, type=type)])


This does not yet handle StringDtype (which is not an ArrowDtype).

Alternative option is something like (untested):

ca = pa.array(self, type=type)
if not isinstance(ca, pa.ChunkedArray):
    ca = pa.chunked_array([ca])

Notes: 1) It's a bit strange, but pa.array() can return a chunked array. It checks for an __arrow_array__ method on the passed object, and for our dtypes backed by a chunked array that method returns the chunked array, which pa.array() then passes through directly as its result. 2) pa.array() is essentially the same as pa.Array.from_pandas, just a bit simpler (and since we rely on it potentially returning a chunked array, calling pa.Array.from_pandas would read even more strangely).

that seems to work, thanks @jorisvandenbossche !

mroeschke · 2024-09-09T17:36:59Z

Thanks for the follow up @MarcoGorelli

fix: use fastpath for PyCapsule export when starting from pyarrow-backed Series, respect requested_schema

a1bf651

MarcoGorelli changed the title ~~fix: use fastpath for PyCapsule export when starting from pyarrow-bac…~~ fix: use fastpath for PyCapsule export when starting from pyarrow-backed Series, respect requested_schema Sep 2, 2024

MarcoGorelli marked this pull request as ready for review September 2, 2024 14:52

jorisvandenbossche reviewed Sep 2, 2024

mroeschke added the Arrow pyarrow functionality label Sep 3, 2024

MarcoGorelli added 3 commits September 6, 2024 13:43

Merge remote-tracking branch 'upstream/main' into fix-pycapsule

a7f5fd0

simplify

04871d4

stringdtype test

bf33163

WillAyd approved these changes Sep 6, 2024

mroeschke added this to the 3.0 milestone Sep 9, 2024

mroeschke approved these changes Sep 9, 2024

mroeschke merged commit 871703d into pandas-dev:main Sep 9, 2024
47 checks passed
