ExtensionArray.map #23179
Comments
-1 on hard coding things; expanding the interface is the way forward here. |
-1 on expanding things needlessly though. I’d rather wait for a compelling use case to come along.
On Tuesday, October 16, 2018 at 7:13 AM, Jeff Reback wrote:
-1 on hard coding things
expanding the interface is the way forward here
|
Trying to de-duplicate is_extension_array_dtype and is_extension_type, I'm finding that the lack of EA.map is a blocker for using is_extension_array_dtype in all cases. |
FWIW, I don't think deduplicating is_extension_array_dtype
and is_extension_type is important enough to warrant adding a new method to
the API.
On Wed, Nov 6, 2019 at 7:31 PM jbrockmendel wrote:
Trying to de-duplicate is_extension_array_dtype and is_extension_type, I'm
finding that the lack of EA.map is a blocker for using
is_extension_array_dtype in all cases.
|
Ran into this in #39941, where Edit: I just found datetime64 also implements map which does not have the property I mentioned. |
Not exactly. For sparse you can make the return dtype always be sparse, but we can come up with UDFs that must have a different sparse dtype. For Categorical you could pass your result to |
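To make the sparse point concrete, here is a minimal sketch using a toy class. `ToySparse` and its fields are hypothetical stand-ins, not the pandas SparseArray; the only pandas fact assumed is that a sparse dtype is parameterized by its fill value. Mapping a function that changes the fill value yields a result that is still sparse but can no longer share the original sparse dtype.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ToySparse:
    """Hypothetical sparse array: only values differing from fill_value are stored."""
    length: int
    fill_value: int
    sp_values: Dict[int, int]  # position -> explicitly stored value

    def to_list(self) -> List[int]:
        # Densify: positions without a stored value take the fill value.
        return [self.sp_values.get(i, self.fill_value) for i in range(self.length)]

    def map(self, func: Callable[[int], int]) -> "ToySparse":
        # func runs once per stored value and once on the fill value,
        # never once per element of the dense array.
        return ToySparse(
            self.length,
            func(self.fill_value),
            {i: func(v) for i, v in self.sp_values.items()},
        )


sparse = ToySparse(length=6, fill_value=0, sp_values={1: 5, 4: 7})
shifted = sparse.map(lambda x: x + 1)

# The result is still sparse, but its fill value moved from 0 to 1.
print(shifted.fill_value, shifted.to_list())
```

In this toy, the result keeps the same number of stored values, yet its fill value differs from the input, which is the sense in which a UDF can force a different sparse dtype.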
Closed. |
Both Categorical and SparseArray found implementing a .map method useful. This allows them to efficiently apply a function / mapping to the categories / sp_values, rather than every element of an array. We dispatch to it internally in https://github.com/pandas-dev/pandas/blob/master/pandas/core/series.py#L3379-L3380
So, we need to either
Do people have a preference? Right now I'm leaning toward 2. Or are there other array types that would have a similar efficiency gain to Categorical or Sparse?
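As a rough illustration of the efficiency gain described above, the sketch below uses a toy categorical container. `ToyCategorical`, `shout`, and the call counter are hypothetical names for illustration only, not pandas code. The function is applied once per category and the integer codes are reused, instead of calling the function on every element.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ToyCategorical:
    """Hypothetical stand-in for a categorical array (not the pandas class)."""
    categories: List[str]  # unique values
    codes: List[int]       # index into `categories` for each element

    def to_list(self) -> List[str]:
        return [self.categories[c] for c in self.codes]

    def map(self, func: Callable[[str], str]) -> "ToyCategorical":
        # Apply func once per category and keep the existing codes.
        # Assumes func sends distinct categories to distinct values.
        return ToyCategorical([func(c) for c in self.categories], list(self.codes))


calls: List[str] = []  # records every invocation of the mapped function


def shout(value: str) -> str:
    calls.append(value)
    return value.upper()


arr = ToyCategorical(categories=["a", "b"], codes=[0, 1, 0, 0, 1, 1])

mapped = arr.map(shout)
category_calls = len(calls)      # one call per category

calls.clear()
elementwise = [shout(v) for v in arr.to_list()]
elementwise_calls = len(calls)   # one call per element

print(category_calls, elementwise_calls)
```

Both paths produce the same values here; the category-based path simply calls the function fewer times, and the gap grows with the ratio of elements to categories.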