Commit a90fbc8

PERF: Series.str.get for pyarrow-backed strings (#53152)
* PERF: Series.str.get for pyarrow-backed strings
* whatsnew
* whatsnew
* whatsnew
1 parent 9bd8b5d commit a90fbc8

2 files changed: +3 -2 lines changed

doc/source/whatsnew/v2.1.0.rst

+1
@@ -292,6 +292,7 @@ Performance improvements
 - Performance improvement in :meth:`DataFrame.loc` when selecting rows and columns (:issue:`53014`)
 - Performance improvement in :meth:`Series.add` for pyarrow string and binary dtypes (:issue:`53150`)
 - Performance improvement in :meth:`Series.corr` and :meth:`Series.cov` for extension dtypes (:issue:`52502`)
+- Performance improvement in :meth:`Series.str.get` for pyarrow-backed strings (:issue:`53152`)
 - Performance improvement in :meth:`Series.to_numpy` when dtype is a numpy float dtype and ``na_value`` is ``np.nan`` (:issue:`52430`)
 - Performance improvement in :meth:`~arrays.ArrowExtensionArray.to_numpy` (:issue:`52525`)
 - Performance improvement when doing various reshaping operations on :class:`arrays.IntegerArrays` & :class:`arrays.FloatingArray` by avoiding doing unnecessary validation (:issue:`53013`)

pandas/core/arrays/arrow/array.py

+2 -2
@@ -1902,8 +1902,8 @@ def _str_get(self, i: int):
         selected = pc.utf8_slice_codeunits(
             self._pa_array, start=start, stop=stop, step=step
         )
-        result = pa.array([None] * self._pa_array.length(), type=self._pa_array.type)
-        result = pc.if_else(not_out_of_bounds, selected, result)
+        null_value = pa.scalar(None, type=self._pa_array.type)
+        result = pc.if_else(not_out_of_bounds, selected, null_value)
         return type(self)(result)

     def _str_join(self, sep: str):
