-
-
Notifications
You must be signed in to change notification settings - Fork 141
Fix value_counts and apply. #401
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
It's possible for the callable in Series.apply to return something non-hashable like a list, but the result of apply should still be a Series.
Whether it returns a Series or a DataFrame depends on the return type of the callable. In the case of the callable returning a scalar, the result is a Series unless the result_type is "broadcast".
acbd4a9
to
e8d6a77
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This all looks good, and the changes for apply
should fix #393 . Can you add a test from that issue to this PR and then we can close that issue?
@@ -685,7 +685,7 @@ class Series(IndexOpsMixin, NDFrame, Generic[S1]): | |||
@overload | |||
def apply( | |||
self, | |||
func: Callable[..., Hashable], | |||
func: Callable[..., Scalar | Sequence | Mapping], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't we basically assume everywhere in this code base that the expectation is that dataframes and series only store hashable/scalar types, and NOT things like dicts
, or arbitrary objects for the matter?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but in this case, it doesn't matter, because he is defining a Callable
that returns types, that apply
will then use to return a Series
.
@Dr-Irv test added as requested. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @danielroseman
Fix a few issues.
Series.value_counts
was wrongly typed as returning a series of the same type as the original (Series[S1]
), when it actually returnsSeries[int]
.Series.apply
does not need to be hashable; it can return any value valid for a Series, which can include non-hashable items like lists or dicts.DataFrame.apply
, likeDataFrameGroupBy.apply
, returns a DataFrame or Series depending on the return value of the callable, not on the set of parameters given. The exception is whenresult_type
is "broadcast", in which case the result is always a DataFrame.assert_type()
to assert the type of any return value