API (string): str.center with pyarrow-backed string dtype #59624
Changes from all commits
75db35b
d9b18c9
c34b2e6
99dd312
1f2902d
@@ -1,13 +1,17 @@
 from __future__ import annotations
 
+from functools import partial
 from typing import (
     TYPE_CHECKING,
     Literal,
 )
 
 import numpy as np
 
-from pandas.compat import pa_version_under10p1
+from pandas.compat import (
+    pa_version_under10p1,
+    pa_version_under17p0,
+)
 
 from pandas.core.dtypes.missing import isna
 
@@ -49,7 +53,19 @@ def _str_pad(
         elif side == "right":
             pa_pad = pc.utf8_rpad
         elif side == "both":
-            pa_pad = pc.utf8_center
+            if pa_version_under17p0:
+                # GH#59624 fall back to object dtype
+                from pandas import array
+
+                obj_arr = self.astype(object, copy=False)  # type: ignore[attr-defined]
+                obj = array(obj_arr, dtype=object)
+                result = obj._str_pad(width, side, fillchar)  # type: ignore[attr-defined]
+                return type(self)._from_sequence(result, dtype=self.dtype)  # type: ignore[attr-defined]

Review comment: Not sure how significant this is, but one small performance disadvantage of this instead of [...] The problem is that we don't know here which pyarrow type to use? But then could do something like [...]

Reply: I'm not inclined to worry about this too much, since it's in a branch that will be going away eventually.

+            else:
+                # GH#54792
+                # https://github.com/apache/arrow/issues/15053#issuecomment-2317032347
+                lean_left = (width % 2) == 0
+                pa_pad = partial(pc.utf8_center, lean_left_on_odd_padding=lean_left)
         else:
             raise ValueError(
                 f"Invalid side: {side}. Side must be one of 'left', 'right', 'both'"
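The `lean_left = (width % 2) == 0` rule in the new branch can be checked without pyarrow. The sketch below is a pure-Python model, not pandas or pyarrow code: `center_with_lean` and `center_like_python` are hypothetical helpers, and the model assumes that `lean_left_on_odd_padding=True` means the extra fill character goes on the right when the padding is odd (my reading of the pyarrow option, worth confirming against its documentation).

```python
def center_with_lean(s: str, width: int, fillchar: str = " ", lean_left: bool = True) -> str:
    """Hypothetical model of centered padding with a lean flag (not a pyarrow API)."""
    pad = width - len(s)
    if pad <= 0:
        return s
    small, extra = divmod(pad, 2)
    # Assumption: leaning left puts the string left of centre,
    # so the odd fill character ends up on the right.
    left = small if lean_left else small + extra
    right = pad - left
    return fillchar * left + s + fillchar * right


def center_like_python(s: str, width: int, fillchar: str = " ") -> str:
    # Rule taken from the diff: lean left exactly when the target width is even.
    lean_left = (width % 2) == 0
    return center_with_lean(s, width, fillchar, lean_left)


if __name__ == "__main__":
    for text, width in [("a", 4), ("ab", 5), ("abc", 6)]:
        print(repr(center_like_python(text, width, "*")), repr(text.center(width, "*")))
```

Under that assumption, the parity rule reproduces `str.center` for every combination of string length and width tried in the loop above, which is presumably the behaviour the diff is aiming for.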
Review comment: Could also do obj_arr.array._str_pad(..) and then we don't need this explicit construction?
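For readers unfamiliar with the fallback being discussed: on older pyarrow, the diff routes `side="both"` through object dtype and converts the result back. A rough pure-Python sketch of that idea follows. `pad_both_fallback` is a hypothetical stand-in, not the pandas implementation, and it assumes the object-dtype path effectively applies Python's `str.center` to each string while leaving missing values untouched.

```python
def pad_both_fallback(values, width, fillchar=" "):
    """Hypothetical stand-in for the object-dtype fallback.

    Pads each string with str.center and passes non-strings
    (for example None for missing values) through unchanged.
    """
    return [
        v.center(width, fillchar) if isinstance(v, str) else v
        for v in values
    ]


if __name__ == "__main__":
    print(pad_both_fallback(["a", None, "ab"], 5, "*"))
```

The trade-off raised in the review is that this path materializes Python objects instead of staying in Arrow memory, which only matters while the older pyarrow versions remain supported.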