
ENH: Series.str.join for ArrowDtype(pa.string()) #53646

Merged: 2 commits merged on Jun 13, 2023

Conversation

lukemanley
Member

Adds support for Series.str.join with ArrowDtype(pa.string()):

In [1]: from pandas import Series, ArrowDtype

In [2]: import pyarrow as pa

In [3]: ser = Series(["abc", "123", None], dtype=ArrowDtype(pa.string()))

In [4]: ser.str.join("-")
Out[4]: 
0    a-b-c
1    1-2-3
2     <NA>
dtype: string[pyarrow]

This is already supported by the string[python] and string[pyarrow] dtypes, and it matches how str.join behaves on plain Python strings:

In [1]: "-".join("abc")
Out[1]: 'a-b-c'
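
The element-wise semantics can be sketched in plain Python, without pandas or pyarrow. The helper `join_chars` below is hypothetical and is not the implementation in this PR; it only illustrates the behavior shown above, assuming each non-missing string is treated as a sequence of characters and missing values stay missing.

```python
from typing import List, Optional, Sequence


def join_chars(values: Sequence[Optional[str]], sep: str) -> List[Optional[str]]:
    """Hypothetical helper: join the characters of each string with `sep`.

    None stands in for a missing value (<NA>) and is passed through unchanged.
    """
    return [None if value is None else sep.join(value) for value in values]


print(join_chars(["abc", "123", None], "-"))
# ['a-b-c', '1-2-3', None]
```

The sketch returns a plain list; the pandas method returns a Series that keeps the input dtype, as in the session above.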

This is work towards being able to add ArrowDtype(pa.string()) to the string benchmarks.

@lukemanley added the Strings (String extension data type and string data) and Arrow (pyarrow functionality) labels on Jun 13, 2023
@lukemanley added this to the 2.1 milestone on Jun 13, 2023
@mroeschke merged commit 1f3c9bc into pandas-dev:main on Jun 13, 2023
@mroeschke
Member

Nice find, thanks @lukemanley

@lukemanley deleted the str-join-pyarrow-string branch on June 13, 2023 23:27
Daquisu pushed a commit to Daquisu/pandas that referenced this pull request Jul 8, 2023
* Series.str.join to support ArrowDtype(pa.string())

* gh ref