MNT: Bump dev pin on NumPy #60987
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -415,7 +415,7 @@ def unique(values): | |
|
||
>>> pd.unique(pd.array([1 + 1j, 2, 3])) | ||
<NumpyExtensionArray> | ||
[(1+1j), (2+0j), (3+0j)] | ||
[np.complex128(1+1j), np.complex128(2+0j), np.complex128(3+0j)] | ||
I merged #54268 a while back to make these reprs render like the Python scalars/pre-NumPy 2.0. It appears that PR didn't touch all the relevant repr methods.

I'll do a precursor to fix the reprs for arrays, and then revert some of the outputs here. Your comment only applies to EA reprs?

I would say any repr in pandas shouldn't be showing the NEP 51 style repr for scalars.

The main motivation behind NEP 51 is that the Python numerical types behave differently from the NumPy scalars. They give a few examples such as

So I think we should be showing the NEP 51 style repr for scalars. @mroeschke you have approved this PR so I assume that was a typo?
||
Length: 3, dtype: complex128 | ||
""" | ||
return unique_with_mask(values) | ||
|
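As a rough illustration of the repr question raised in the review thread on `unique`, the sketch below contrasts a NumPy scalar's own repr with the repr of the equivalent Python builtin. The helper name `python_style_repr` is hypothetical and is not part of pandas or of this PR; it only assumes that NumPy scalars subclass `np.generic` and support `.item()`.

```python
import numpy as np


def python_style_repr(value):
    """Hypothetical helper: repr a value the way a Python builtin would print.

    NumPy scalars are unwrapped with .item() first, so the NEP 51 prefix
    (for example "np.complex128(...)") never appears in the result.
    """
    if isinstance(value, np.generic):
        value = value.item()
    return repr(value)


scalar = np.complex128(1 + 1j)

# NumPy >= 2.0 prints np.complex128(1+1j); earlier versions print (1+1j).
print(repr(scalar))

# Independent of the NumPy version.
print(python_style_repr(scalar))  # (1+1j)
```

Whether pandas array reprs should go through a conversion like this is exactly what the reviewers are debating; the sketch only shows that the two renderings differ.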
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -941,7 +941,7 @@ def argmin(self, skipna: bool = True) -> int: | |
-------- | ||
>>> arr = pd.array([3, 1, 2, 5, 4]) | ||
>>> arr.argmin() | ||
1 | ||
np.int64(1) | ||
""" | ||
# Implementer note: You have two places to override the behavior of | ||
# argmin. | ||
|
@@ -975,7 +975,7 @@ def argmax(self, skipna: bool = True) -> int: | |
-------- | ||
>>> arr = pd.array([3, 1, 2, 5, 4]) | ||
>>> arr.argmax() | ||
3 | ||
np.int64(3) | ||
""" | ||
# Implementer note: You have two places to override the behavior of | ||
# argmax. | ||
|
@@ -1072,7 +1072,7 @@ def interpolate( | |
... limit_area="inside", | ||
... ) | ||
<NumpyExtensionArray> | ||
[0.0, 1.0, 2.0, 3.0] | ||
[np.float64(0.0), np.float64(1.0), np.float64(2.0), np.float64(3.0)] | ||
Length: 4, dtype: float64 | ||
|
||
Interpolating values in a FloatingArray: | ||
|
@@ -1962,7 +1962,7 @@ def _formatter(self, boxed: bool = False) -> Callable[[Any], str | None]: | |
... return lambda x: "*" + str(x) + "*" if boxed else repr(x) + "*" | ||
>>> MyExtensionArray(np.array([1, 2, 3, 4])) | ||
<MyExtensionArray> | ||
[1*, 2*, 3*, 4*] | ||
[np.int64(1)*, np.int64(2)*, np.int64(3)*, np.int64(4)*] | ||
NEP 51 changes the repr of NumPy scalars to help users distinguish them from the Python builtin types and to clarify their behavior; array representation is not affected, since it already includes the dtype= when necessary. So we should probably change the example here to not use the NEP 51 repr in the output?

I agree with Simon. Historically I don't think we have ever favored exposing NumPy scalars directly to end users. If that has happened as an implementation detail that is one thing, but explicitly showing that to an end user runs counter to what we have done in the past.

Updated.
||
Length: 4, dtype: int64 | ||
""" | ||
if boxed: | ||
|
@@ -2176,15 +2176,15 @@ def _reduce( | |
Examples | ||
-------- | ||
>>> pd.array([1, 2, 3])._reduce("min") | ||
1 | ||
np.int64(1) | ||
>>> pd.array([1, 2, 3])._reduce("max") | ||
3 | ||
np.int64(3) | ||
>>> pd.array([1, 2, 3])._reduce("sum") | ||
6 | ||
np.int64(6) | ||
>>> pd.array([1, 2, 3])._reduce("mean") | ||
2.0 | ||
np.float64(2.0) | ||
>>> pd.array([1, 2, 3])._reduce("median") | ||
2.0 | ||
np.float64(2.0) | ||
""" | ||
meth = getattr(self, name, None) | ||
if meth is None: | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -558,7 +558,7 @@ def array(self) -> ExtensionArray: | |
|
||
>>> pd.Series([1, 2, 3]).array | ||
<NumpyExtensionArray> | ||
[1, 2, 3] | ||
[np.int64(1), np.int64(2), np.int64(3)] | ||
Length: 3, dtype: int64 | ||
|
||
For extension types, like Categorical, the actual ExtensionArray | ||
|
@@ -804,9 +804,9 @@ def argmax( | |
dtype: float64 | ||
|
||
>>> s.argmax() | ||
2 | ||
np.int64(2) | ||
>>> s.argmin() | ||
0 | ||
Comment on lines 806 to -809

There are a few of these where I'm wondering if we should be returning Python scalars instead of NumPy. Should issues be opened for these? cc @pandas-dev/pandas-core

I think generally we always want to return Python scalars (IIRC we got a lot of issues about this in iteration and iteration-like APIs in the past).

Even just wrapping the result of

I agree we should always return Python scalars. I'm surprised at the number of failures that expect NumPy scalars.

I'd think you need a deprecation on this, because people may have code that depends on the result being a numpy scalar. I think that the tests we have in

We could put it up behind a
||
np.int64(0) | ||
|
||
The maximum cereal calories is the third element and | ||
the minimum cereal calories is the first element, | ||
|
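To make the deprecation concern from the `argmax`/`argmin` thread concrete, here is a minimal sketch using plain NumPy rather than pandas. The `calories` values are made up for illustration, and the `int(...)` conversion is only one possible way to obtain a Python scalar, not necessarily what pandas would adopt.

```python
import numpy as np

# Illustrative data only; the values are an assumption, not taken from the PR.
calories = np.array([100.0, 110.0, 120.0, 110.0])

numpy_pos = calories.argmax()  # NumPy integer scalar
python_pos = int(numpy_pos)    # plain Python int

# Both values index the array identically.
assert calories[numpy_pos] == calories[python_pos] == 120.0

# Only the NumPy scalar carries array-like attributes such as .dtype,
# and it is not an instance of the builtin int.
print(hasattr(numpy_pos, "dtype"), hasattr(python_pos, "dtype"))  # True False
print(isinstance(numpy_pos, int), isinstance(python_pos, int))    # False True
```

Code that reads `.dtype` from the result, or that checks `isinstance(result, np.integer)`, would change behavior if the return type switched to a Python `int`, which is why a deprecation cycle is suggested in the thread.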
@@ -1360,7 +1360,7 @@ def factorize( | |
dtype: int64 | ||
|
||
>>> ser.searchsorted(4) | ||
3 | ||
np.int64(3) | ||
|
||
>>> ser.searchsorted([0, 4]) | ||
array([0, 3]) | ||
|
@@ -1379,7 +1379,7 @@ def factorize( | |
dtype: datetime64[s] | ||
|
||
>>> ser.searchsorted('3/14/2000') | ||
3 | ||
np.int64(3) | ||
|
||
>>> ser = pd.Categorical( | ||
... ['apple', 'bread', 'bread', 'cheese', 'milk'], ordered=True | ||
|