Semantics between Series.apply and Series.agg incorrect #51880

Closed
mhriemers opened this issue Mar 10, 2023 · 2 comments
Labels
Apply (Apply, Aggregate, Transform, Map), Series (Series data structure)

Comments

@mhriemers

Correct me if I'm wrong, but isn't Series.agg supposed to reduce all rows to a single value, and Series.apply supposed to apply an operation to each individual row? Currently, when a function doesn't raise one of ValueError, AttributeError, or TypeError, it is assumed to be a function meant for Series.apply, even when it is passed to Series.agg.

Consider the following example:

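import numpy as np

def one_sample_t_test(r, mu_0):
    # t-statistic: (mean of r - mu_0) / (std of r / sqrt(n))
    return (np.mean(r) - mu_0) / (np.std(r) / np.sqrt(len(r)))

# s is an existing pandas Series
s.agg(lambda r: one_sample_t_test(r, 0))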

This generates exactly the same Series as before, because the function is called with individual values instead of the whole Series. When used in conjunction with another aggregator, like mean, it crashes, since a single value can't be concatenated with a Series.
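For reference, here is a minimal pure-Python restatement of the statistic in the example, to show that it is only meaningful on a whole sample. It is a sketch, not the code from the report: it swaps NumPy for the standard library, uses `statistics.pstdev` because `np.std` defaults to the population standard deviation (ddof=0), and the `sample` data is made up for illustration.

```python
import math
import statistics


def one_sample_t_test(r, mu_0):
    # Same formula as in the report: (mean - mu_0) / (std / sqrt(n)),
    # with the population standard deviation to match np.std's default.
    return (statistics.fmean(r) - mu_0) / (statistics.pstdev(r) / math.sqrt(len(r)))


# Hypothetical sample, chosen so that mean = 5 and population std = 2.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# One number for the whole sample, which is what an aggregation should return.
t_stat = one_sample_t_test(sample, 0)
print(t_stat)
```

The result is a single scalar per sample, so a per-value evaluation cannot produce it.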

pandas/pandas/core/apply.py

Lines 1040 to 1051 in c293caf

# try a regular apply, this evaluates lambdas
# row-by-row; however if the lambda is expected a Series
# expression, e.g.: lambda x: x-x.quantile(0.25)
# this will fail, so we can try a vectorized evaluation
# we cannot FIRST try the vectorized evaluation, because
# then .agg and .apply would have different semantics if the
# operation is actually defined on the Series, e.g. str
try:
    result = self.obj.apply(f)
except (ValueError, AttributeError, TypeError):
    result = f(self.obj)

Why would somebody ever call Series.agg with a function that only works on single values?
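The control flow of the quoted fallback can be mimicked without pandas. The following is a toy sketch under stated assumptions: `agg_like`, `double`, and `total` are hypothetical names, a plain list stands in for the Series, and a list comprehension stands in for `self.obj.apply(f)`. It only illustrates why a function that happens to work on single values never gets to see the whole object.

```python
def agg_like(values, f):
    """Toy stand-in for the quoted try/except, not pandas code."""
    try:
        # Mirrors `self.obj.apply(f)`: call f once per element.
        return [f(v) for v in values]
    except (ValueError, AttributeError, TypeError):
        # Mirrors `f(self.obj)`: call f once with the whole container.
        return f(values)


def double(x):
    # Works on a single number, so the element-wise attempt succeeds.
    return x * 2


def total(xs):
    # Needs an iterable; sum() on a single int raises TypeError,
    # which triggers the fallback to the whole container.
    return sum(xs)


data = [1, 2, 3]
elementwise_result = agg_like(data, double)  # f is never given the whole list
reduced_result = agg_like(data, total)       # reached only via the except branch
print(elementwise_result, reduced_result)
```

In this sketch, whether a call reduces or maps depends only on whether the function raises for a single element, which is the behavior the report objects to.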

@mhriemers
Author

Related to #49673.

@DeaMariaLeon added the Apply (Apply, Aggregate, Transform, Map) and Series (Series data structure) labels on Mar 10, 2023
@rhshadrach
Member

rhshadrach commented Apr 23, 2023

Thanks for the report - I think this is a duplicate of #49673. Please comment here if you think this is not accurate and we can reopen.


3 participants