-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
TYPING: change to FrameOrSeries Alias in pandas._typing #29480
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
What is the problem this solves? Looks OK so no objections if you think it's a better approach |
didn't realize how few uses of FrameOrSeries we had in master until looked at this. so reluctant to change and then potentially need to change back again. thinking that as we review, if we think FrameOrSeries should be generic in a class or function and it causes unexpected mypy issues we could suggest using If we find this definition more useful, we could then use this as the global definition and use TypeVar("FrameOrSeries", bound="NDFrame") at the module level if necessary (i.e. core.generic) |
Sure; this is still active development so no objections to moving things around if you find easier. The only thing I don't want to do is put a looser definition like |
agreed. if necessary we can just use the Union as needed. However, not sure about return types in these cases. The recommended approach is not to return Union types. Whereas NDFrame does not permit narrowing between Series/DataFrame. as you said, active development, lets see if this becomes an issue. |
General question for @simonjayhawkins and @WillAyd: am I the only one having trouble with the distinction between NDFrame vs FrameOrSeries vs Union["DataFrame", "Series"] or whatever else? If its just me then ill set aside some time to get informed, but if others are having issues, it may merit discussion on the mailing list or a call. |
No it can be confusing. Very loosely the difference between: def some_function(obj: FrameOrSeries) -> FrameOrSeries:
... and def some_function(obj: Union[Series, DataFrame]) -> Union[Series, DataFrame]:
... Is that with the former the type that goes in and out of the function is homogenous, i.e. if you pass a DataFrame in you get a DataFrame out, if you pass a Series in you get a Series out. With the latter, basically the cartesian product of types is allowable, i.e. you can pass a DataFrame in and get either a Series or a DataFrame back, and likewise if you pass a Series in you can still get a Series or DataFrame back. I think in some cases like the following the usage can be interchangeable def length(obj: FrameOrSeries) -> int:
...
def length(obj: Union[Series, DataFrame]) -> int:
... Because there is no type parametrization, i.e. FrameOrSeries only appears once in signature. |
@@ -336,13 +347,13 @@ def _group_selection_context(groupby): | |||
groupby._reset_group_selection() | |||
|
|||
|
|||
class _GroupBy(PandasObject, SelectionMixin): | |||
class _GroupBy(PandasObject, SelectionMixin, Generic[FrameOrSeries]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm not sure I would consider _GroupBy a generic class - is this required as part of this change or just experimenting?
closing to clear queue. No issues encountered in last few weeks. can re-open if necessary. |
@WillAyd
I think we should discuss reverting #28173 and only use
TypeVar("FrameOrSeries", bound="NDFrame")
incore.generic
. perhaps call it_NDFrameT
to avoid confusion.This PR is to show the changes required to keep mypy green if we wanted to revert.
It also reverts the change in #29458, see #29458 (comment)
_GroupBy
is defined as a generic class, but not sure aboutGrouping
I've created it as a draft PR, since I think we should wait for more issues to surface going forward to help decide. The number of changes needed at this stage is not significant enough to warrant a hasty decision.