-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
DOC/TYP: index.take return val #40521
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DOC/TYP: index.take return val #40521
Conversation
allow_fill: bool = True, | ||
fill_value=None, | ||
**kwargs, | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any advice on typing this return value? Can almost use the TypeVar _IndexT
, but the problem is that RangeIndex.take
returns a superclass (Int64Index
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in the base class use Index
, and can then use a typevar in the subclasses that do return the same type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried something along those lines originally - the problem I was running into is how to handle subclasses which rely on parent implementations. For example, for ExtensionIndex
would like to be able to write a stub like
def take(
self: _ExtensionIndexT,
indices,
axis: int = 0,
allow_fill: bool = True,
fill_value=None,
**kwargs,
) -> _ExtensionIndexT: ...
to essentially type the child while still relying on the inherited implementation. But couldn't figure out a way to achieve this.
pandas/core/indexes/base.py
Outdated
def take(self, indices, axis=0, allow_fill=True, fill_value=None, **kwargs): | ||
def take( | ||
self, | ||
indices: Union[ArrayLike, Sequence[int]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is ArrayLike really needed here? it should also be a Sequence[int]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see #28770 for issue about EA, but applicable to Series, Index and np.ndarray also
from typing import Sequence, cast
import numpy as np
import pandas as pd
from pandas.core.arrays import ExtensionArray
df = pd.DataFrame({"a": [1, 2, 3]})
idx = df.index
arr: np.ndarray = cast(np.ndarray, idx._values)
ea: ExtensionArray = pd.array(arr, dtype="Int64")
ser = pd.Series(arr)
reveal_type(idx)
reveal_type(arr)
reveal_type(ea)
reveal_type(ser)
def func(arr: Sequence):
pass
func(idx)
func(arr)
func(ea)
func(ser)
/home/simon/t.py:11: note: Revealed type is 'pandas.core.indexes.base.Index'
/home/simon/t.py:12: note: Revealed type is 'numpy.ndarray'
/home/simon/t.py:13: note: Revealed type is 'pandas.core.arrays.base.ExtensionArray'
/home/simon/t.py:14: note: Revealed type is 'pandas.core.series.Series'
/home/simon/t.py:21: error: Argument 1 to "func" has incompatible type "Index"; expected "Sequence[Any]" [arg-type]
/home/simon/t.py:22: error: Argument 1 to "func" has incompatible type "ndarray"; expected "Sequence[Any]" [arg-type]
/home/simon/t.py:23: error: Argument 1 to "func" has incompatible type "ExtensionArray"; expected "Sequence[Any]" [arg-type]
/home/simon/t.py:24: error: Argument 1 to "func" has incompatible type "Series"; expected "Sequence[Any]" [arg-type]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mzeitlin11 for the PR. generally lgtm, a couple of suggestions.
pandas/core/indexes/base.py
Outdated
def take(self, indices, axis=0, allow_fill=True, fill_value=None, **kwargs): | ||
def take( | ||
self, | ||
indices: Union[ArrayLike, Sequence[int]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see #28770 for issue about EA, but applicable to Series, Index and np.ndarray also
from typing import Sequence, cast
import numpy as np
import pandas as pd
from pandas.core.arrays import ExtensionArray
df = pd.DataFrame({"a": [1, 2, 3]})
idx = df.index
arr: np.ndarray = cast(np.ndarray, idx._values)
ea: ExtensionArray = pd.array(arr, dtype="Int64")
ser = pd.Series(arr)
reveal_type(idx)
reveal_type(arr)
reveal_type(ea)
reveal_type(ser)
def func(arr: Sequence):
pass
func(idx)
func(arr)
func(ea)
func(ser)
/home/simon/t.py:11: note: Revealed type is 'pandas.core.indexes.base.Index'
/home/simon/t.py:12: note: Revealed type is 'numpy.ndarray'
/home/simon/t.py:13: note: Revealed type is 'pandas.core.arrays.base.ExtensionArray'
/home/simon/t.py:14: note: Revealed type is 'pandas.core.series.Series'
/home/simon/t.py:21: error: Argument 1 to "func" has incompatible type "Index"; expected "Sequence[Any]" [arg-type]
/home/simon/t.py:22: error: Argument 1 to "func" has incompatible type "ndarray"; expected "Sequence[Any]" [arg-type]
/home/simon/t.py:23: error: Argument 1 to "func" has incompatible type "ExtensionArray"; expected "Sequence[Any]" [arg-type]
/home/simon/t.py:24: error: Argument 1 to "func" has incompatible type "Series"; expected "Sequence[Any]" [arg-type]
allow_fill: bool = True, | ||
fill_value=None, | ||
**kwargs, | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in the base class use Index
, and can then use a typevar in the subclasses that do return the same type.
Co-authored-by: Simon Hawkins <[email protected]>
Co-authored-by: Simon Hawkins <[email protected]>
…nto doc/index_take_meth
pandas/core/indexes/base.py
Outdated
def take(self, indices, axis=0, allow_fill=True, fill_value=None, **kwargs): | ||
def take( | ||
self, | ||
indices: Union[AnyArrayLike, Sequence[int]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isnt Sequence[int] going to cover the AnyArrayLike case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think even with AnyArrayLike
in place of ArrayLike
we run into the issue mentioned here #40521 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is too complicated and specific an annotation to put here, leave this untyped for now, or put it in a typing alias, but then we should enforce it, alt we shouldn't allow non-array-likes here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left untyped for now, now mostly just a doc fix
@@ -886,7 +886,7 @@ def astype(self, dtype, copy=True): | |||
|
|||
Parameters | |||
---------- | |||
indices : list | |||
indices : array-like |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think Sequence[int] would be more helpful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Happy to change, my only question would be if we want to be using type hints for user facing API. The numpy documentation for take
just uses array_like
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think Sequence[int] would be more helpful
we shouldn't use typing notation for docstrings. https://pandas.pydata.org/pandas-docs/dev/development/contributing_docstring.html#parameter-types
and from that guide...
If the exact type is not relevant, but must be compatible with a NumPy array, array-like can be specified. If Any type that can be iterated is accepted, iterable can be used:
but the guide doesn't mention sequence.
maybe sequence of int
is allowed, but I think matching numpy is fine.
pandas/core/indexes/base.py
Outdated
def take(self, indices, axis=0, allow_fill=True, fill_value=None, **kwargs): | ||
def take( | ||
self, | ||
indices: Union[AnyArrayLike, Sequence[int]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is too complicated and specific an annotation to put here, leave this untyped for now, or put it in a typing alias, but then we should enforce it, alt we shouldn't allow non-array-likes here
@jbrockmendel can you take another look, @jreback comments have been addressed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Thanks @mzeitlin11 |
xref #40513 (comment)