TYP: prep _generate_range_overflow_safe for numpy 1.20 #39067

Closed
16 changes: 11 additions & 5 deletions pandas/core/arrays/_ranges.py
@@ -69,8 +69,11 @@ def generate_regular_range(


 def _generate_range_overflow_safe(
-    endpoint: int, periods: int, stride: int, side: str = "start"
-) -> int:
+    endpoint: Union[int, np.integer],
+    periods: Union[int, np.integer],
+    stride: int,
+    side: str = "start",
+) -> np.integer:
     """
     Calculate the second endpoint for passing to np.arange, checking
     to avoid an integer overflow. Catch OverflowError and re-raise
@@ -89,7 +92,7 @@ def _generate_range_overflow_safe(

     Returns
     -------
-    other_end : int
+    np.integer

     Raises
     ------
@@ -136,8 +139,11 @@ def _generate_range_overflow_safe(


 def _generate_range_overflow_safe_signed(
-    endpoint: int, periods: int, stride: int, side: str
-) -> int:
+    endpoint: Union[int, np.integer],
Contributor

We shouldn't do this specifically here. Almost anywhere we accept int we also accept np.integer, e.g. by using is_integer checking.
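
To illustrate the point above, here is a minimal sketch of a runtime check that accepts both a Python `int` and an `np.integer`. The helper `accepts_integer` is hypothetical: it is a stand-in for an `is_integer`-style check, not pandas' actual implementation, and excluding `bool` is an assumption made for this sketch.

```python
import numpy as np


def accepts_integer(value: object) -> bool:
    """Hypothetical stand-in for an is_integer-style runtime check."""
    # bool is a subclass of int in Python, so exclude it explicitly
    # (an assumption made for this sketch).
    if isinstance(value, bool):
        return False
    # Accept the builtin int as well as any NumPy integer scalar.
    return isinstance(value, (int, np.integer))


print(accepts_integer(3))            # Python int
print(accepts_integer(np.int64(3)))  # NumPy integer scalar
print(accepts_integer(3.0))          # float is rejected
```

A check like this works at runtime regardless of the annotation, which is why the annotation question in this thread is purely about what the type checker sees.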

Member Author

Yeah, we should use an alias in pandas._typing. However, my concern is that if we put it in place before we transition to mypy checking with numpy types, there will be many more mypy errors that are not yet reported or visible, potentially making the transition harder.

see also scipy/scipy#10844 (review) and numpy/numpy#18096

Here, I was just using the Union where we internally explicitly pass a np.int to a function, for now.

I'm not convinced we should do this yet, either, so we can use this PR for discussion.

I also need to convince myself that those new errors are actually numpy issues with the return types of np.integer before we use _int = Union[int, np.integer].

import numpy as np

py_int = 42
reveal_type(py_int)
reveal_type(py_int // 2)
reveal_type(py_int - 1)

np_int = np.int64(np.iinfo(np.int64).min)
reveal_type(np_int)
reveal_type(np_int // 2)
reveal_type(np_int - 1)
print(np_int - 1)

np.integer(np_int)

np_integer: np.integer
reveal_type(np_integer)
reveal_type(np_integer // 2)
reveal_type(np_integer - 1)
test.py:4: note: Revealed type is 'builtins.int'
test.py:5: note: Revealed type is 'builtins.int'
test.py:6: note: Revealed type is 'builtins.int'
test.py:9: note: Revealed type is 'numpy.signedinteger[numpy.typing._64Bit*]'
test.py:10: note: Revealed type is 'numpy.signedinteger[Any]'
test.py:11: note: Revealed type is 'numpy.signedinteger[Any]'
test.py:14: error: Cannot instantiate abstract class 'integer' with abstract attribute '__init__'  [abstract]
test.py:17: note: Revealed type is 'numpy.integer[Any]'
test.py:18: note: Revealed type is 'numpy.number[Any]'
test.py:19: note: Revealed type is 'numpy.number[Any]'
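
For reference, the alias under discussion could be sketched as below. The name `_int` comes from the comment above; the function `halve` is hypothetical and only there to exercise the alias. This shows runtime behaviour only. Whether a type checker accepts the return annotation depends on what the NumPy stubs infer for `np.integer // int`, which is the open question in this thread.

```python
from typing import Union

import numpy as np

# Alias from the discussion: a Python int or any NumPy integer scalar.
_int = Union[int, np.integer]


def halve(value: _int) -> _int:
    """Hypothetical helper: floor-divide an integer of either kind by 2."""
    return value // 2


py_result = halve(42)
np_result = halve(np.int64(42))

# The runtime type follows the input type.
print(type(py_result).__name__, py_result)
print(type(np_result).__name__, np_result)
```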

Contributor

> Yeah, we should use an alias in pandas._typing. However, my concern is that if we put it in place before we transition to mypy checking with numpy types, there will be many more mypy errors that are not yet reported or visible, potentially making the transition harder.

How does not using an alias help?

Member Author

I guess while I keep #36092 up to date, we do have some visibility.

But if we start using an alias now, while numpy types resolve to Any, we may increase the number of mypy errors to resolve when transitioning to using numpy types.

Contributor

So would it be better then to not type this output, avoiding both scenarios?

Member Author

Indeed, removing some of the problematic type annotations could be a better short-term solution. We can add them back once we have numpy types and the mypy errors are visible to help resolve them.

Contributor

kk great. Yeah, we'd certainly love to have more annotations, but if they are going to be false positives then we shouldn't add them (for now).

+    periods: Union[int, np.integer],
+    stride: int,
+    side: str,
+) -> np.integer:
     """
     A special case for _generate_range_overflow_safe where `periods * stride`
     can be calculated without overflowing int64 bounds.
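
For readers unfamiliar with what these functions guard against, here is a rough sketch of the kind of overflow check the docstrings describe. It is not the pandas implementation. The name `second_endpoint`, the use of Python's arbitrary-precision `int` for the intermediate arithmetic, and raising a plain `OverflowError` are all assumptions made for illustration.

```python
import numpy as np

# int64 bounds as plain Python ints.
I64_MIN = np.iinfo(np.int64).min
I64_MAX = np.iinfo(np.int64).max


def second_endpoint(endpoint, periods, stride, side="start"):
    """Hypothetical sketch: compute the other end of a range, refusing
    any result that does not fit in int64."""
    # Python ints never overflow, so do the arithmetic there first.
    addend = int(periods) * int(stride)
    if side == "start":
        # endpoint is the start; the other end lies `addend` later.
        result = int(endpoint) + addend
    elif side == "end":
        # endpoint is the end; the other end lies `addend` earlier.
        result = int(endpoint) - addend
    else:
        raise ValueError(f"side must be 'start' or 'end', got {side!r}")
    if not I64_MIN <= result <= I64_MAX:
        raise OverflowError(f"range endpoint {result} exceeds int64 bounds")
    # Only convert once the value is known to fit.
    return np.int64(result)


print(second_endpoint(0, 10, 5))               # fits comfortably
print(second_endpoint(np.int64(100), 10, 5, side="end"))
try:
    second_endpoint(I64_MAX, 1, 1)
except OverflowError as err:
    print("overflow:", err)
```

The sketch accepts either `int` or `np.integer` for `endpoint` and `periods` and returns a NumPy integer, which mirrors the signature change proposed in the diff.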