API: round() method to ignore non-numerical columns? #11885

Closed
jorisvandenbossche opened this issue Dec 22, 2015 · 3 comments · Fixed by #11923
Labels: API Design; Dtype Conversions (unexpected or buggy dtype conversions)

@jorisvandenbossche
Member

If you call the new round method on a mixed dataframe (not only numerical columns), you get an error:

In [1]: df = pd.DataFrame({'floats':[0.123, 0.156, 5.687], 'strings': ['a', 'b', 'c']})

In [2]: df.round(2)
TypeError: can't multiply sequence by non-int of type 'float'

In [3]: df.round()
AttributeError: 'str' object has no attribute 'rint'

Would it be better to ignore these non-numerical columns (returning them unchanged) instead of raising an error?
Another option would be to drop these columns, as numerical aggregations such as df.mean() do.
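
To make the two options concrete, here is a minimal sketch built on select_dtypes. It only illustrates the behaviours being discussed and is not the change that was made in pandas; round_numeric is a hypothetical helper name introduced for this example.

```python
import numpy as np
import pandas as pd


def round_numeric(df, decimals=0):
    """Hypothetical helper: round numeric columns, pass all others through."""
    out = df.copy()
    # Pick out only the columns with a numeric dtype.
    numeric_cols = list(df.select_dtypes(include=[np.number]).columns)
    out[numeric_cols] = df[numeric_cols].round(decimals)
    return out


df = pd.DataFrame({'floats': [0.123, 0.156, 5.687], 'strings': ['a', 'b', 'c']})

# Option 1: ignore non-numerical columns but keep them in the result.
ignored = round_numeric(df, 2)

# Option 2: drop non-numerical columns before rounding.
dropped = df.select_dtypes(include=[np.number]).round(2)
```

With option 1 the result keeps both columns, so the shape of the frame is preserved; with option 2 only the 'floats' column survives.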

@shoyer
Member

shoyer commented Dec 23, 2015

I think it would make sense to ignore non-numerical columns.

What about time columns? Didn't we add support for round with times recently, or is that still in the works?

@jorisvandenbossche
Member Author

There is indeed new support for rounding datetime64 series, but the problem is that this goes through the accessor method .dt.round() and not .round(). So it is not really obvious how to integrate that.

In [1]: df = pd.DataFrame({'floats':[0.123, 0.156, 5.687], 'dates':pd.date_range('2012-01-01', periods=3)})

In [3]: df.round(2)
TypeError: ufunc multiply cannot use operands with types dtype('<M8[ns]') and dtype('float64')

In [4]: df['dates'].round()
TypeError: ufunc 'rint' not supported for the input types, and the inputs could
not be safely coerced to any supported types according to the casting rule ''safe''

This is a more general problem with dtypes that have specialized versions of standard methods requiring other keywords.
One option would be to expose datetime rounding through Series.round() as well, rather than only through Series.dt.round().

In any case, that is maybe another discussion, so for now I would say to also ignore datetime and period columns.
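
For reference, a small sketch of the keyword mismatch described above: numeric rounding takes a number of decimals, while datetime rounding through the .dt accessor takes a frequency. The sample values are made up for illustration.

```python
import pandas as pd

# Numeric rounding is controlled by a number of decimals.
floats = pd.Series([0.123, 0.156, 5.687])
rounded_floats = floats.round(2)

# Datetime rounding goes through the .dt accessor and needs a frequency.
dates = pd.Series(pd.to_datetime(['2012-01-01 10:15:00', '2012-01-01 13:45:00']))
rounded_dates = dates.dt.round('D')  # round to the nearest day
```

Here 10:15 rounds down to 2012-01-01 and 13:45 rounds up to 2012-01-02, which shows why a single decimals argument cannot drive both kinds of rounding.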

@jreback
Contributor

jreback commented Dec 23, 2015

.round should normally apply only to float/int columns. I would simply skip all others (but include them in the return). I think we could pass the freq keyword through DataFrame.round to handle datetimelike rounding.

Further, Series.round could defer to .dt.round for datetimelikes (with the freq argument non-optional in that case). We can make that another issue, I think.
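
A rough sketch of the dispatch proposed in this comment, under stated assumptions: round_column and round_frame are hypothetical names, the freq keyword here belongs to the sketch and not to DataFrame.round, and datetime columns are simply passed through when no freq is given.

```python
import pandas as pd
from pandas.api.types import is_datetime64_any_dtype, is_numeric_dtype


def round_column(col, decimals=0, freq=None):
    """Hypothetical per-column dispatch for rounding."""
    if is_numeric_dtype(col):
        return col.round(decimals)
    if is_datetime64_any_dtype(col) and freq is not None:
        # Defer to the datetime accessor when a frequency is supplied.
        return col.dt.round(freq)
    # Everything else is skipped but still included in the result.
    return col


def round_frame(df, decimals=0, freq=None):
    """Hypothetical frame-level wrapper; assumes unique column names."""
    rounded = {name: round_column(df[name], decimals, freq) for name in df.columns}
    return pd.DataFrame(rounded, index=df.index)


df = pd.DataFrame({
    'floats': [0.123, 0.156, 5.687],
    'dates': pd.to_datetime(['2012-01-01 10:15', '2012-01-02 13:45', '2012-01-03 00:00']),
    'strings': ['a', 'b', 'c'],
})

out = round_frame(df, 2, freq='D')  # floats and dates rounded, strings untouched
out_no_freq = round_frame(df, 2)    # only floats rounded
```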

@jreback added the Dtype Conversions and Difficulty Intermediate labels, then removed Difficulty Intermediate, on Dec 23, 2015
@jreback added this to the Next Major Release milestone on Dec 23, 2015
@jreback changed the milestone from Next Major Release to 0.18.0 on Dec 29, 2015