Skip to content

Fix value_counts and apply. #401

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Oct 27, 2022
Merged

Conversation

danielroseman
Copy link
Contributor

Fix a few issues.

  • Series.value_counts was wrongly typed as returning a series of the same type as the original (Series[S1]), when it actually returns Series[int].
  • The result of the callable given to Series.apply does not need to be hashable; it can return any value valid for a Series, which can include non-hashable items like lists or dicts.
  • DataFrame.apply, like DataFrameGroupBy.apply, returns a DataFrame or Series depending on the return value of the callable, not on the set of parameters given. The exception is when result_type is "broadcast", in which case the result is always a DataFrame.
  • Tests added: Please use assert_type() to assert the type of any return value

It's possible for the callable in Series.apply to return something
non-hashable like a list, but the result of apply should still be a
Series.
Whether it returns a Series or a DataFrame depends on the return type of
the callable. In the case of the callable returning a scalar, the result
is a Series unless the result_type is "broadcast".
Copy link
Collaborator

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks good, and the changes for apply should fix #393 . Can you add a test from that issue to this PR and then we can close that issue?

@@ -685,7 +685,7 @@ class Series(IndexOpsMixin, NDFrame, Generic[S1]):
@overload
def apply(
self,
func: Callable[..., Hashable],
func: Callable[..., Scalar | Sequence | Mapping],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we basically assume everywhere in this code base that the expectation is that dataframes and series only store hashable/scalar types, and NOT things like dicts, or arbitrary objects for the matter?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but in this case, it doesn't matter, because he is defining a Callable that returns types, that apply will then use to return a Series.

@danielroseman
Copy link
Contributor Author

@Dr-Irv test added as requested.

Copy link
Collaborator

@Dr-Irv Dr-Irv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants