QST: Behaviour of mean() for a Series of strings. #34957

Closed
shwina opened this issue Jun 23, 2020 · 2 comments
Labels: Duplicate Report, Usage Question

Comments

@shwina
Contributor

shwina commented Jun 23, 2020

Greetings, Pandas devs! My question is about the behaviour of mean() for a Series of strings:

In [1]: import pandas as pd

In [2]: pd.Series(['1', '2']).mean()
Out[2]: 6.0

From #34671 (comment), the result is 6 because:

"1" + "2" is "12", and then we convert to numeric 12, and divide by the count
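
The arithmetic in the quote can be reproduced in plain Python. This is a minimal sketch, assuming the sum is built with repeated `+` and the result is then coerced to a float; `naive_string_mean` is a hypothetical helper written for illustration, not a pandas function.

```python
def naive_string_mean(values):
    """Mimic the quoted steps: 'sum' with +, convert to a number, divide by count.

    Hypothetical illustration only; this is not how pandas is implemented.
    """
    total = values[0]
    for v in values[1:]:
        total = total + v  # for str operands, + concatenates: "1" + "2" == "12"
    return float(total) / len(values)  # float("12") / 2


print(naive_string_mean(["1", "2"]))  # 6.0
```

The surprising value comes entirely from `+` meaning concatenation for strings, so the "sum" is `"12"` rather than `3`.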

This feels like somewhat implicit behaviour that makes assumptions about how the mean is computed.

Would it be more desirable to raise an exception here? Or perhaps return 1.5 instead?
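
For comparison, converting the strings to numbers explicitly before averaging gives 1.5. The sketch below uses `pd.to_numeric` for the conversion; it shows one explicit approach a user could take and is not a statement of what `mean()` itself should do.

```python
import pandas as pd

s = pd.Series(["1", "2"])

# Convert explicitly; by default to_numeric raises ValueError on unparseable strings.
numeric = pd.to_numeric(s)

print(numeric.mean())  # 1.5
```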

@galipremsagar

Commenting to follow the thread.

@rhshadrach
Member

Thanks for the question - as the discussion in the issue you mentioned indicates, 6.0 is not the desired output. If you have any feedback/suggestions, please feel welcome to mention them in the original issue rather than have discussions going on in two separate issues.

Closing as a duplicate of #34671

rhshadrach added the Duplicate Report label and removed the Needs Triage label on Jun 23, 2020