Skip to content

ENH: Improve Pandas scalars #383

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 30 commits into from
Closed

Conversation

bashtage
Copy link
Contributor

  • Tests added: Please use assert_type() to assert the type of any return value

@bashtage bashtage force-pushed the pandas-scalars branch 2 times, most recently from 44c5c93 to 5d21d0f Compare October 12, 2022 17:56
Copy link
Collaborator

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. When checking the operators (+, -, *, >, <, etc.) for a pandas type PandasType (e.g. Period, Timestamp, etc.) need to check pandas_type_instance + other_type_instance and other_type_instance + pandas_type_instance` (Hope that makes sense)
  2. There is some overlap with what is tested in test_interval.py and test_timefuncs.py, so that should get reconciled.

@overload # type: ignore[override]
def __eq__(self, other: _PeriodEqualityComparison) -> bool: ...
@overload
def __eq__(self, other: PeriodIndex | DatetimeIndex) -> npt.NDArray[np.bool_]: ...
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the first overload should be object. You can compare a Period to any object. But then you can't make that the first overload. So swap the order, as the first overload would return npt.NDArray[np.bool_] and the second one would be for object returning bool

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tricky here. For now I've only added types that could actually be either True or False. For always False types, can get array-like[bool] when compared to array-like or bool for scalar types.

@Dr-Irv
Copy link
Collaborator

Dr-Irv commented Oct 14, 2022

@bashtage Let me know when I should review again.

@bashtage
Copy link
Contributor Author

@Dr-Irv I think it is ready for review.

Copy link
Collaborator

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are still things from my previous review that need to be addressed.

This is getting pretty big, making it a challenge to review, as I'll keep finding things. It also takes a long time to review all of these changes at once. When you're confident that all my comments have been addressed, I then have to go and check that myself, and look at every line again. It's pretty time consuming. When the PR's are smaller, I can turn them around faster by finding small chunks of time here and there. I spent 90 minutes tonight just looking at changes since my last review, and added 41 comments!

I would prefer that this PR be split up into 5 separate PR's. Handle changes for each particular scalar (Interval, Period, Timestamp, Timedelta) as a separate PR for each one, and do one PR for the arrays

@@ -17,13 +17,17 @@ class IntervalArray(IntervalMixin, ExtensionArray):
cls, data, closed=..., dtype=..., copy: bool = ..., verify_integrity: bool = ...
): ...
@classmethod
def from_breaks(cls, breaks, closed: str = ..., copy: bool = ..., dtype=...): ...
def from_breaks(
cls, breaks, closed: str = ..., copy: bool = ..., dtype=...
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

closed should be IntervalClosedType

def from_breaks(cls, breaks, closed: str = ..., copy: bool = ..., dtype=...): ...
def from_breaks(
cls, breaks, closed: str = ..., copy: bool = ..., dtype=...
) -> IntervalArray: ...
@classmethod
def from_arrays(
cls, left, right, closed: str = ..., copy: bool = ..., dtype=...
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

closed should be IntervalClosedType

@classmethod
def from_tuples(cls, data, closed: str = ..., copy: bool = ..., dtype=...): ...
def from_tuples(
cls, data, closed: str = ..., copy: bool = ..., dtype=...
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

closed should be IntervalClosedType

@bashtage
Copy link
Contributor Author

@Dr-Irv I think all of the other PRs except the arrays one are ready for review.

@Dr-Irv
Copy link
Collaborator

Dr-Irv commented Oct 24, 2022

@bashtage should we close this since you split it into multiple PR's?

@bashtage bashtage closed this Oct 24, 2022
@bashtage bashtage deleted the pandas-scalars branch November 28, 2022 06:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants