REF: move sharable methods to ExtensionIndex #30717
lgtm
Is the goal to provide an index class that can work with arbitrary EAs? If so, I think that ExtensionIndex should only be using methods and attributes defined on ExtensionArray. In particular, I don't think that
I guess this is ambiguous. The base EA class does have _ndarray_values, but the docstring says "This method is not part of the pandas interface"
Mmm, that's a bit tricky then. Perhaps this will help us pin down some of the required semantics on Longer-term, what are your plans for ExtensionIndex? Will it be a base class that 3rd parties can inherit from and customize?
That's an option. Ideally I'd like to keep the customization inside the EAs. ATM I'm working on smoothing out the small differences between e.g. DatetimeIndex.searchsorted vs DatetimeArray.searchsorted so we can both delegate more and improve internal consistency.
Sounds good. Happy to continue finding common methods and develop an interface out of that. Can you check the ASVs for our EA-backed indexes on this branch?
It looks like IntervalIndexing.time_getitem_list and time_loc_list are significantly slower; will look into this.
Looks like the slowdown was caused by using
Can you rebase?
Thanks @jbrockmendel
if self.hasnans:
    return self._shallow_copy(self._data[~self._isnan])
return self._shallow_copy()
Why are you overwriting the base Index one?
Also, this dropped the docstring.
Because the base class uses ._values, where we want ._data here
But _values and _data are the same?
I guess. Past-me must have thought it not-obvious that this would always hold. If it can be removed, go for it.
Did you also see my docstring comment?
I did. In this case I think removing the method makes sense. More generally I wonder if we can use a metaclass or something to automatically inherit docstrings and remove a lot of boilerplate (cc @bashtage IIRC you do something like this in arch)
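For reference, a minimal sketch of what the metaclass idea could look like. This is only an illustration, not what arch or pandas actually does: the names InheritDocstrings, Base and Child are hypothetical, and the approach assumes plain functions defined in the class body.

```python
class InheritDocstrings(type):
    """Hypothetical metaclass: fill in missing method docstrings from base classes."""

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        for attr_name, attr in namespace.items():
            # Skip non-callables and anything that already has a docstring.
            if not callable(attr) or getattr(attr, "__doc__", None):
                continue
            # Walk the MRO (skipping the new class itself) for a documented parent.
            for base in cls.__mro__[1:]:
                parent = getattr(base, attr_name, None)
                if parent is not None and getattr(parent, "__doc__", None):
                    attr.__doc__ = parent.__doc__
                    break
        return cls


class Base(metaclass=InheritDocstrings):
    def dropna(self):
        """Return the object without missing values."""
        raise NotImplementedError


class Child(Base):
    def dropna(self):  # no docstring written here; the metaclass copies Base's
        return self
```

Note that inspect.getdoc already falls back to the parent's docstring on Python 3.5+, so something like this would mainly matter for tools that read __doc__ directly.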
Maybe open a new issue to see if we can do this smarter?
But for 1.0.0, I would just add back the docstring
def __getitem__(self, key):
    result = self._data[key]
    if isinstance(result, type(self._data)):
        return type(self)(result, name=self.name)
Should we use a faster constructor (simple_new ?) when we just want to wrap the correct type of ExtensionArray in the index?
I think that'd work. IIRC there were some corner cases involving CategoricalIndex.dtype, not sure if those are relevant here
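A rough sketch of the fast-constructor idea, using toy classes rather than the real pandas ones. ToyArray and ToyIndex are hypothetical stand-ins, and the _simple_new shown here is an assumption about the general pattern (skip validation when the data is already the right array type), not the actual pandas implementation.

```python
class ToyArray:
    """Hypothetical stand-in for an ExtensionArray."""

    def __init__(self, values):
        self.values = list(values)

    def __getitem__(self, key):
        result = self.values[key]
        # Slices return a new array; scalar keys return the element itself.
        return ToyArray(result) if isinstance(key, slice) else result


class ToyIndex:
    """Hypothetical stand-in for an EA-backed index."""

    def __init__(self, data, name=None):
        # Regular path: coerce arbitrary input to the backing array type.
        if not isinstance(data, ToyArray):
            data = ToyArray(data)
        self._data = data
        self.name = name

    @classmethod
    def _simple_new(cls, data, name=None):
        # Fast path: the caller guarantees `data` is already a ToyArray,
        # so __init__ and its coercion are bypassed entirely.
        obj = object.__new__(cls)
        obj._data = data
        obj.name = name
        return obj

    def __getitem__(self, key):
        result = self._data[key]
        if isinstance(result, type(self._data)):
            # Already the right array type, so just wrap it.
            return type(self)._simple_new(result, name=self.name)
        return result
```

Whether this is safe for every subclass depends on what the regular constructor does beyond coercion, which is where the CategoricalIndex.dtype corner cases could come in.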