TST: [ArrowStringArray] more parameterised testing - part 3 #40755
@@ -254,9 +254,9 @@ def test_replace2(self):
         assert (ser[6:10] == -1).all()
         assert (ser[20:30] == -1).all()

-    def test_replace_with_dictlike_and_string_dtype(self):
+    def test_replace_with_dictlike_and_string_dtype(self, nullable_string_dtype):
arrow_string behaves the same as string:
>>> pd.__version__
'1.3.0.dev0+1219.gccc6a707e2'
>>>
>>> s = pd.Series(["one", "two", np.nan], dtype="string")
>>>
>>> s
0     one
1     two
2    <NA>
dtype: string
>>>
>>> s.replace({"one": "1", "two": "2"})
0       1
1       2
2    <NA>
dtype: object
>>>
Is this correct?
I would say that's a bug, but in any case it is unrelated to this PR (there are a few open issues about replace with nullable dtypes, so it might already be covered).
See e.g. #40732 (for other nullable dtypes).
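The behaviour questioned in this thread can be checked with a short script along the following lines. This is only a sketch: the helper name `replace_and_report` is hypothetical, and the script reports whether the dtype survives `replace` rather than assuming either outcome, since the result depends on the pandas version in use.

```python
import numpy as np
import pandas as pd


def replace_and_report(dtype="string"):
    """Hypothetical helper: run a dict-like replace and report dtype preservation."""
    ser = pd.Series(["one", "two", np.nan], dtype=dtype)
    result = ser.replace({"one": "1", "two": "2"})
    # Compare dtype names so the check works for both extension and NumPy dtypes.
    preserved = str(result.dtype) == str(ser.dtype)
    return result, preserved


result, preserved = replace_and_report()
print(result.dtype, "(dtype preserved)" if preserved else "(dtype changed)")
```

On the development build quoted above, this would report a changed dtype (object instead of string).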
pandas/tests/io/test_parquet.py
Outdated
         df = pd.DataFrame(
             {
                 "a": pd.Series([1, 2, 3], dtype="Int64"),
                 "b": pd.Series([1, 2, 3], dtype="UInt32"),
                 "c": pd.Series(["a", None, "c"], dtype="string"),
+                "d": pd.Series(["a", None, "c"], dtype="arrow_string"),
This test is decorated with @td.skip_if_no("pyarrow", min_version="0.15.0"), so it should be safe to add here.
Maybe not. ImportError: pyarrow>=1.0.0 is required for PyArrow backed StringArray.
Yeah, this test has a minimum pyarrow version of 0.15. I would make a separate, dedicated test for ArrowStringArray.
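The mismatch discussed in this thread can be illustrated with a small standalone sketch. It is not how `skip_if_no` is implemented; the helper names and the installed version `0.17.1` are hypothetical, and the parser assumes plain dotted-integer version strings.

```python
def parse_version(text):
    """Turn '0.15.0' into (0, 15, 0); assumes plain dotted integers."""
    return tuple(int(part) for part in text.split("."))


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


PARQUET_TEST_MIN = "0.15.0"  # minimum from the skip_if_no decorator on the test
ARROW_STRING_MIN = "1.0.0"   # minimum from the ImportError message

installed = "0.17.1"  # hypothetical pyarrow version on a CI worker

# The parquet test is not skipped for this version...
runs_test = meets_minimum(installed, PARQUET_TEST_MIN)
# ...but the pyarrow-backed string array cannot be constructed.
supports_arrow_string = meets_minimum(installed, ARROW_STRING_MIN)

print(runs_test, supports_arrow_string)
```

Any installed version between the two minimums runs the test but fails on the "arrow_string" column, which is why a dedicated test with its own version gate is the cleaner option.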
pandas/tests/io/test_parquet.py
Outdated
@@ -812,11 +812,15 @@ def test_write_with_schema(self, pa):
     def test_additional_extension_arrays(self, pa):
         # test additional ExtensionArrays that are supported through the
         # __arrow_array__ protocol

+        from pandas.core.arrays.string_arrow import ArrowStringDtype  # noqa: F401
This forces registration of "arrow_string". It is not strictly necessary, but it won't be missed, as ArrowStringDtype will be removed.
One comment about test_parquet.py, but apart from that, is this PR ready?
Thanks!