added return types for "SeriesGroupBy" and "DataFrameGroupBy" #455
tests/test_frame.py
Outdated
grouped1 = ser.groupby(ser > 100)
c1 = grouped.transform(lambda x: (x - x.mean()) / x.std())
c2 = grouped1.transform(lambda x: x.max() - x.min())
check(assert_type(c1, "pd.DataFrame"), pd.DataFrame)
don't need quotes here
Done sir
@@ -61,7 +61,7 @@ class SeriesGroupBy(GroupBy, Generic[S1]):
def agg(self, func: list[AggFuncTypeBase], *args, **kwargs) -> DataFrame: ...
@overload
def agg(self, func: AggFuncTypeBase, *args, **kwargs) -> Series: ...
def transform(self, func, *args, **kwargs): ...
def transform(self, func, *args, **kwargs) -> Series: ...
Can you add func: Callable here (and below)?
After adding Callable, lines 643 and 645 show this error: Argument 1 to "transform" of "DataFrameGroupBy" has incompatible type "str"; expected "Callable[..., Any]" [arg-type]
If we use Callable[..., str] instead, it shows the same error.
@@ -61,7 +61,7 @@ class SeriesGroupBy(GroupBy, Generic[S1]):
def agg(self, func: list[AggFuncTypeBase], *args, **kwargs) -> DataFrame: ...
@overload
def agg(self, func: AggFuncTypeBase, *args, **kwargs) -> Series: ...
def transform(self, func, *args, **kwargs): ...
def transform(self, func: Callable, *args, **kwargs) -> Series: ...
Looks like it needs to be Callable | str (in both of them).
Can you also create an issue in the pandas repo saying that the docs for transform need to indicate that the function parameter in groupby.transform() can be a string function name, and that an example should be included?
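A note on why Callable[..., str] produced the same error: that annotation describes a callable that returns a str, so a plain string argument such as "sum" still does not match it. A union of a callable and str is what accepts both. The following is only a minimal sketch in plain Python; TransformFunc and describe are hypothetical names used for illustration and are not part of pandas or pandas-stubs.

```python
from typing import Any, Callable, Union

# Hypothetical alias: either any callable or a string function name.
# In a stub file this could be spelled `Callable | str`.
TransformFunc = Union[Callable[..., Any], str]


def describe(func: TransformFunc) -> str:
    """Report which branch of the union an argument falls into."""
    if isinstance(func, str):
        return f"named kernel: {func}"
    return f"callable: {getattr(func, '__name__', repr(func))}"


by_name = describe("sum")  # a str is accepted by the union
by_call = describe(max)    # so is a callable
```

A type checker would reject describe("sum") if the parameter were annotated as Callable[..., str], because the string itself is not callable.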
yes I'll do that
tests/test_frame.py
Outdated
c1 = grouped.transform(lambda x: (x - x.mean()) / x.std())
c2 = grouped1.transform(lambda x: x.max() - x.min())
check(assert_type(c1, pd.DataFrame), pd.DataFrame)
check(assert_type(c2, pd.Series), pd.Series)
Can you add tests for grouped.transform("sum") and grouped1.transform("cumsum")?
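For reference, the runtime behaviour those two tests would exercise can be sketched as follows. The data and the names df, ser, c3 and c4 are assumptions made up for this example; the real test file uses its own fixtures and the check(assert_type(...)) helper rather than isinstance.

```python
import pandas as pd

# Illustrative data only; not taken from tests/test_frame.py.
df = pd.DataFrame({"col1": ["a", "a", "b"], "col2": [1, 2, 3]})
ser = pd.Series([50, 150, 200])

grouped = df.groupby("col1")       # DataFrameGroupBy
grouped1 = ser.groupby(ser > 100)  # SeriesGroupBy

# String function names are accepted in place of a callable.
c3 = grouped.transform("sum")
c4 = grouped1.transform("cumsum")

# The return types match what the stubs in this PR declare.
assert isinstance(c3, pd.DataFrame)
assert isinstance(c4, pd.Series)
```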
Yes sir
@@ -166,7 +166,7 @@ class DataFrameGroupBy(GroupBy):
) -> DataFrame: ...
def aggregate(self, arg: AggFuncTypeFrame = ..., *args, **kwargs) -> DataFrame: ...
agg = aggregate
def transform(self, func, *args, **kwargs): ...
def transform(self, func: Callable | str, *args, **kwargs) -> DataFrame: ...
Can/should we make this a Literal?
From https://github.com/pandas-dev/pandas/blob/main/pandas/core/groupby/base.py#L60, the transformation kernels would be the ones applicable here:
transformation_kernels = Literal[
"bfill",
"cumcount",
"cummax",
"cummin",
"cumprod",
"cumsum",
"diff",
"ffill",
"fillna",
"ngroup",
"pct_change",
"rank",
"shift",
]
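As a rough illustration of how such a Literal could be used, the sketch below defines the alias from the list above and reads its members back with typing.get_args. The names TransformationKernel and is_transformation_kernel are hypothetical and are not taken from pandas or pandas-stubs. The list is only the one quoted in this comment: "sum", which the tests requested earlier pass to transform, is not in it, so a stub would need a wider set of names.

```python
from typing import Literal, get_args

# Hypothetical alias built from the kernel names quoted in the comment above.
TransformationKernel = Literal[
    "bfill",
    "cumcount",
    "cummax",
    "cummin",
    "cumprod",
    "cumsum",
    "diff",
    "ffill",
    "fillna",
    "ngroup",
    "pct_change",
    "rank",
    "shift",
]


def is_transformation_kernel(name: str) -> bool:
    """Runtime check against the names listed in the Literal."""
    return name in get_args(TransformationKernel)


cumsum_ok = is_transformation_kernel("cumsum")
sum_ok = is_transformation_kernel("sum")  # not in this particular list
```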
Is there any way to do better than Callable? Like what about Callable[Concatenate[pd.DataFrame, P], pd.DataFrame]? Will ParamSpec work in that context? If not, can we use a Protocol? I know that we can't use P.args on *args and P.kwargs on **kwargs, though that would have been a very nice feature.
I'm just thinking, if we are going from an unannotated function to an annotated function, a loose annotation doesn't really do that much good versus not having an annotation at all.
We can do this, but I think it would be better to ask Dr Irv about adding it.
I'd rather be incremental in terms of PRs, where we could specify the Literal values (there are more than just the ones you listed), as well as making the func argument more specific. @gandhis1 feel free to create an issue indicating that we could improve the typing on groupby.transform by being more specific about the allowed argument combinations. It also would require a lot more tests.
The focus of this PR was just to get the return types right and make a slight improvement to the func argument.
thanks @ramvikrams
Thanks @Dr-Irv sir
assert_type() to assert the type of any return value