Fix inference for fixed with numpy strings with arrow string option #54496
@jbrockmendel gentle ping
pa = pytest.importorskip("pyarrow")
expected = Series(["a", "b"], dtype=pd.ArrowDtype(pa.string()))
with pd.option_context("future.infer_string", True):
    ser = Series(np.array(["a", "b"]))
It might be good to specify dtype explicitly here.
Not sure how numpy's string dtype work is coming along, but just in case that becomes the default, it would be good to future-proof this.
I'd prefer to keep this. We should infer the default type as arrow-backed strings for now; that's the most common use case for users.
    {"a": ["a", "b"], "b": ["c", "d"]},
    dtype=dtype,
    columns=Index(["a", "b"], dtype=dtype),
)
Can you add a test for a 2-D string ndarray?
e.g.
a = np.array([['a', 'b'], ['c', 'd']])
df = pd.DataFrame(a)
added
Moving off milestone for now. If Brock reviews in time, can always put it back.
@jbrockmendel ping
…ngs with arrow string option
…y strings with arrow string option) (#54672) Backport PR #54496: Fix inference for fixed with numpy strings with arrow string option Co-authored-by: Patrick Hoefler <[email protected]>