API/DOC: an ExtensionDtype.__from_arrow__ method to convert pyarrow.Array into ExtensionArray #29229
Would this make more sense to be defined on the
@TomAugspurger asked the same question on the issue: #20612 (comment), where I gave some arguments. I am certainly not tied to it, but reasons I prefer the dtype:
lgtm - we already construct from the dtype internally so this is appropriate
Note: this PR is ready from my part, but let's leave it open until the feature is actually merged in Arrow (based on feedback there, details might still change).
OK, this is merged in arrow now. So merging here as well then.
…rray into ExtensionArray (pandas-dev#29229)
xref the discussion in #20612, and a companion to my PR in Arrow: apache/arrow#5512
Summary: to support ExtensionArrays in the conversion of an arrow table into a pandas DataFrame, we need some way to convert pyarrow Arrays into pandas ExtensionArrays, given a certain pandas dtype (in pyarrow, we can for example know the resulting dtype from the stored metadata).
For that, I propose to add the ExtensionDtype.__from_arrow__ method, with the following signature:
Note: I only added documentation about it (which should still be expanded) for now, and not a method in the base class (eg a NotImplementedError), because in pyarrow we use hasattr to see if this is supported (see the linked arrow PR).
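To make the opt-in protocol concrete, here is a rough sketch of how the hasattr check could dispatch to a dtype's __from_arrow__ method. Everything in it is an assumption for illustration: FakeArrowArray, MyExtensionArray, MyDtype, LegacyDtype and convert are hypothetical stand-ins (not pandas or pyarrow classes), and the fallback branch is invented; the real signature and the actual pyarrow logic live in the PR diff and the linked Arrow PR.

```python
class FakeArrowArray:
    """Hypothetical stand-in for pyarrow.Array (keeps the sketch import-free)."""

    def __init__(self, values):
        self._values = list(values)

    def to_pylist(self):
        # pyarrow.Array offers a method of this name; here it just copies the list.
        return list(self._values)


class MyExtensionArray:
    """Hypothetical stand-in for a pandas ExtensionArray."""

    def __init__(self, values):
        self.values = list(values)


class MyDtype:
    """Hypothetical dtype that opts in by defining __from_arrow__."""

    def __from_arrow__(self, array):
        # Given an arrow array, build the extension array for this dtype.
        return MyExtensionArray(array.to_pylist())


class LegacyDtype:
    """Hypothetical dtype that does not define the hook."""


def convert(dtype, arrow_array):
    # Assumed consumer-side logic: no base-class method exists, so the
    # presence of the attribute itself signals support.
    if hasattr(dtype, "__from_arrow__"):
        return dtype.__from_arrow__(arrow_array)
    # Invented fallback for dtypes without the hook: plain Python objects.
    return arrow_array.to_pylist()


arr = FakeArrowArray([1, None, 3])
converted = convert(MyDtype(), arr)
fallback = convert(LegacyDtype(), arr)
```

This also shows why a base-class method raising NotImplementedError would get in the way: hasattr would then be true for every dtype, and the consumer could no longer tell supported dtypes apart without calling the method and catching the error.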