ENH: Allow ArrowDtype(pa.string()) to be compatible with str accessor #51207
@@ -56,6 +56,7 @@
     is_numeric_dtype,
     is_period_dtype,
     is_sparse,
+    is_string_dtype,
     is_timedelta64_dtype,
     needs_i8_conversion,
 )

@@ -76,7 +77,6 @@
     BaseMaskedArray,
     BaseMaskedDtype,
 )
-from pandas.core.arrays.string_ import StringDtype
 from pandas.core.frame import DataFrame
 from pandas.core.groupby import grouper
 from pandas.core.indexes.api import (

@@ -96,6 +96,7 @@
 )

 if TYPE_CHECKING:
+    from pandas.core.arrays.string_ import StringDtype
     from pandas.core.generic import NDFrame



Review comment (on the StringDtype import): the annotation that uses this may no longer be accurate?
Reply: Good point. Added to
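
For context on the exchange above: an import placed under `if TYPE_CHECKING:` is visible to static type checkers but is never executed at runtime, so the name can only appear in annotations that are not evaluated. A minimal sketch, using `decimal.Decimal` as a stand-in for `StringDtype` and a hypothetical function `double` that is not part of the PR:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen by type checkers only; this import never runs at runtime.
    from decimal import Decimal


def double(x: "Decimal") -> "Decimal":
    """Hypothetical example function; the annotations are plain strings."""
    # Works for any value that supports multiplication by an int.
    return x * 2
```

Because the annotation is only a string, nothing checks at runtime that the value really has the annotated type, which is why an annotation can silently go stale when the accepted dtypes are widened.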
@@ -387,7 +388,7 @@ def _ea_to_cython_values(self, values: ExtensionArray) -> np.ndarray:
             # All of the functions implemented here are ordinal, so we can
             # operate on the tz-naive equivalents
             npvalues = values._ndarray.view("M8[ns]")
-        elif isinstance(values.dtype, StringDtype):
+        elif is_string_dtype(values.dtype):
             # StringArray
             npvalues = values.to_numpy(object, na_value=np.nan)
         else:

@@ -405,7 +406,7 @@ def _reconstruct_ea_result(
         """
         dtype: BaseMaskedDtype | StringDtype

-        if isinstance(values.dtype, StringDtype):
+        if is_string_dtype(values.dtype):
             dtype = values.dtype
             string_array_cls = dtype.construct_array_type()
             return string_array_cls._from_sequence(res_values, dtype=dtype)

Review comment (on the is_string_dtype check): doesn't make a difference here because we know we have an EA, but we should have a function like is_string_dtype that excludes object.
Review comment: I see this is already in the string_arrow code, but why "//$"?
Reply: Not sure. As you mentioned, I just copied it over from the string_arrow code.
Review comment: OK. Best guess is it should be backslashes, meant to exclude escaped dollar signs. OK to consider out of scope.
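
To illustrate the reviewer's guess: a pattern ending in `\$` matches a literal dollar sign rather than the end of the string, so an end anchor would still need to be appended, whereas a check for the suffix `"//$"` (forward slashes) never sees such a pattern. The sketch below assumes the check in question decides whether to append `$` to a regex; `ensure_end_anchor` is a hypothetical helper written for this illustration, not code from the PR or from pandas.

```python
import re


def ensure_end_anchor(pat: str) -> str:
    """Hypothetical helper: append "$" unless pat ends with an unescaped "$"."""
    if pat.endswith("$"):
        body = pat[:-1]
        # Count the backslashes immediately before the final "$".
        backslashes = len(body) - len(body.rstrip("\\"))
        if backslashes % 2 == 0:
            # Even count: the "$" is a real end anchor, so keep pat as-is.
            return pat
    # Either no trailing "$", or the trailing "$" is escaped (a literal).
    return pat + "$"


# A pattern ending in an escaped dollar sign does not end in "//$",
# so a forward-slash suffix check cannot detect it.
escaped = r"a\$"
print(escaped.endswith("//$"))
print(ensure_end_anchor(escaped))
print(bool(re.match(ensure_end_anchor(escaped), "a$b")))
```

Counting the backslashes, rather than testing for a fixed suffix, also handles a pattern such as `a\\$`, where the backslash itself is escaped and the `$` is a genuine anchor.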