ENH: implement FloatingArray.round() #38866

arw2019 · 2020-12-31T21:43:06Z

closes BUG: pandas.Series.round not yet implemented for FloatingArray #38844
tests added / passed
passes black pandas
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

jreback · 2021-01-01T00:11:12Z

pandas/core/arrays/floating.py

@@ -410,6 +410,14 @@ def max(self, *, skipna=True, **kwargs):
        nv.validate_max((), kwargs)
        return super()._reduce("max", skipna=skipna)

+    def round(self, decimals=0):


should do this generically (and move to NumericArray), e.g. something like

def round(self, decimals=0): return self._apply(np.round, decimals=decimals)

where _apply handles the masking & recasting. I think we discussed similar in .to_numeric

or even better

round = _apply_function(np.round, decimals=0, "nice doc-string here")

If we're going with the second way (may as well), would _apply_function be a method on NumericArray (versus a module-level function)?
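To make the suggestion above concrete, here is a minimal sketch of a generic `_apply` that handles the masking and recasting, with `round` built on top of it. `ToyMaskedArray` is a hypothetical stand-in with `_data` and `_mask` attributes, not the pandas `NumericArray`, and this is only one way it could be done.

```python
from typing import Callable

import numpy as np


class ToyMaskedArray:
    """Hypothetical stand-in for a masked numeric array (not pandas' NumericArray)."""

    def __init__(self, data: np.ndarray, mask: np.ndarray) -> None:
        self._data = data
        self._mask = mask  # True marks a missing value

    def _apply(self, func: Callable, **kwargs) -> "ToyMaskedArray":
        # Apply func only to the valid entries, then rebuild a full-size result.
        values = func(self._data[~self._mask], **kwargs)
        data = np.zeros(self._data.shape, dtype=values.dtype)
        data[~self._mask] = values
        # Copy the mask so the result does not share state with the original.
        return type(self)(data, self._mask.copy())

    def round(self, decimals: int = 0) -> "ToyMaskedArray":
        return self._apply(np.round, decimals=decimals)


arr = ToyMaskedArray(np.array([1.234, 5.678, 9.999]), np.array([False, True, False]))
rounded = arr.round(decimals=1)
```

Any other elementwise NumPy function could be routed through the same `_apply`, which is the appeal of doing it generically.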

simonjayhawkins · 2021-01-01T14:27:49Z

milestoned as 1.2.1. ok to merge changes to experimental types in patch releases?

jreback · 2021-01-03T16:24:53Z

These generally should not be backported, as EAs (especially Floating) are still experimental.

jreback · 2021-01-03T16:25:20Z

milestoned as 1.2.1. ok to merge changes to experimental types in patch releases?

We can decide on a case-by-case basis. The fix I suggest here is somewhat non-trivial.


jreback · 2021-01-05T00:38:17Z

pandas/core/arrays/numeric.py

+
+    def _apply(self, func: Callable, **kwargs) -> "NumericArray":
+        values = self._data[~self._mask]
+        values = np.round(values, **kwargs)


should be func :->

jreback · 2021-01-05T00:38:28Z

pandas/core/arrays/numeric.py

@@ -130,3 +158,16 @@ def _arith_method(self, other, op):
            )

        return self._maybe_mask_result(result, mask, other, op_name)
+
+    def _apply(self, func: Callable, **kwargs) -> "NumericArray":
+        values = self._data[~self._mask]


can you add a doc-string

jreback · 2021-01-05T00:39:16Z

pandas/core/arrays/numeric.py

@@ -130,3 +158,16 @@ def _arith_method(self, other, op):
            )

        return self._maybe_mask_result(result, mask, other, op_name)
+
+    def _apply(self, func: Callable, **kwargs) -> "NumericArray":


cc @jbrockmendel @jorisvandenbossche

shall we make this more general? (e.g. on base.py)


jorisvandenbossche · 2021-01-05T08:51:44Z

Since I think being able to round is quite essential functionality for floating dtypes that we missed, I think it makes sense to backport this, exactly because it is still experimental.
(Since it's experimental anyway, we don't need to be as careful about backporting as for other parts of pandas, and by backporting this, it will be easier for users to already experiment with the new dtype in pandas 1.2.x, and thus we can get more feedback.)

jorisvandenbossche

@arw2019 thanks for working on this!

I think this needs some more testing, and also for the actual array methods (I would maybe create a new test_function.py in the /tests/arrays/masked/ dir for this)

jorisvandenbossche · 2021-01-05T08:52:27Z

doc/source/whatsnew/v1.3.0.rst

@@ -54,6 +54,7 @@ Other enhancements
 - Add support for dict-like names in :class:`MultiIndex.set_names` and :class:`MultiIndex.rename` (:issue:`20421`)
 - :func:`pandas.read_excel` can now auto detect .xlsb files (:issue:`35416`)
 - :meth:`.Rolling.sum`, :meth:`.Expanding.sum`, :meth:`.Rolling.mean`, :meth:`.Expanding.mean`, :meth:`.Rolling.median`, :meth:`.Expanding.median`, :meth:`.Rolling.max`, :meth:`.Expanding.max`, :meth:`.Rolling.min`, and :meth:`.Expanding.min` now support ``Numba`` execution with the ``engine`` keyword (:issue:`38895`)
+- Added :meth:`NumericArray.round` (:issue:`38844`)


NumericArray is not public, and thus we shouldn't mention it in the whatsnew notes. You can say something about "round() being enabled for the nullable integer and floating dtypes"

jorisvandenbossche · 2021-01-05T08:52:52Z

pandas/core/arrays/numeric.py

+        data[~self._mask] = values
+        return type(self)(data, self._mask)
+
+    @doc(_round_doc)


The docstring can be moved inline?

jorisvandenbossche · 2021-01-05T08:59:12Z

pandas/core/arrays/numeric.py

@@ -130,3 +158,16 @@ def _arith_method(self, other, op):
            )

        return self._maybe_mask_result(result, mask, other, op_name)
+
+    def _apply(self, func: Callable, **kwargs) -> "NumericArray":


(For this PR I would leave it here, to do the minimum needed to actually implement round(), and have a follow-up to discuss how we might want to use this more generally, because indeed we probably want that.)

jorisvandenbossche · 2021-01-05T09:01:44Z

pandas/core/arrays/numeric.py

@@ -130,3 +158,16 @@ def _arith_method(self, other, op):
            )

        return self._maybe_mask_result(result, mask, other, op_name)
+
+    def _apply(self, func: Callable, **kwargs) -> "NumericArray":
+        values = self._data[~self._mask]


I don't think you actually need to subset the _data with the mask in this case, as "round" should work on all values, and I can't think of a case where it would error by being called on the "invalid" values hidden by the mask.

Of course, if many values are masked, we might be calculating round on too many values. But doing the filter operation / copy also takes time. Maybe something to time both ways.
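A rough way to time both options could look like the sketch below. It uses plain NumPy arrays as stand-ins for `_data` and `_mask`; the helper names `round_subset` and `round_all`, the array size, and the 50% mask fraction are all assumptions chosen for illustration.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
data = rng.random(100_000) * 100
mask = rng.random(100_000) < 0.5  # True marks a missing value


def round_subset(data, mask, decimals=0):
    # Filter out masked entries first, round, then scatter back.
    out = np.zeros(data.shape, dtype=data.dtype)
    out[~mask] = np.round(data[~mask], decimals)
    return out


def round_all(data, mask, decimals=0):
    # Round every entry, including those hidden by the mask.
    return np.round(data, decimals)


a = round_subset(data, mask, 2)
b = round_all(data, mask, 2)
# The valid (unmasked) entries agree under both strategies.
same_on_valid = np.array_equal(a[~mask], b[~mask])

# Timings depend on the machine and on the fraction of masked values.
t_subset = timeit.timeit(lambda: round_subset(data, mask, 2), number=20)
t_all = timeit.timeit(lambda: round_all(data, mask, 2), number=20)
print(f"subset: {t_subset:.4f}s  all: {t_all:.4f}s")
```

Varying the mask fraction would show whether the filter and copy ever pay off.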

jorisvandenbossche · 2021-01-05T09:04:02Z

pandas/core/arrays/numeric.py

+
+        data = np.zeros(self._data.shape)
+        data[~self._mask] = values
+        return type(self)(data, self._mask)


The mask needs to be copied I think? (result should not share a mask with the original array, because otherwise editing one can modify the other. We should probably also test this)

That's a good point (and actually the same bug exists already in my implementation of to_numeric for EAs - #38974). I'll fix this and add tests
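The aliasing problem can be illustrated with plain NumPy arrays standing in for the masks (a sketch, not the pandas code itself). A test along these lines, using `np.shares_memory`, is one possible way to check that the result owns its mask.

```python
import numpy as np

mask = np.array([False, True, False])  # True marks a missing value

# Sharing: the "result" keeps a reference to the very same mask object.
shared_mask = mask
# Copying: the result owns an independent mask.
copied_mask = mask.copy()

# Mutating the original mask (e.g. by setting a value to missing)...
mask[0] = True

# ...leaks into the shared mask but not into the copy.
print(shared_mask[0])  # True
print(copied_mask[0])  # False

# A check a test could use: the copy does not share memory with the original.
print(np.shares_memory(copied_mask, mask))  # False
```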

jorisvandenbossche · 2021-01-05T09:05:41Z

pandas/core/arrays/numeric.py

+
+    @doc(_round_doc)
+    def round(self, decimals: int = 0, *args, **kwargs) -> "NumericArray":
+        nv.validate_round(args, kwargs)


If we accept args/kwargs here and validate them, then we should also test this (eg doing np.round(float_arr) triggers this)
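To see why `np.round(float_arr)` exercises the args/kwargs path: for objects that are not ndarrays, `np.round` delegates to the object's own `round` method and typically forwards `decimals` and `out` as keyword arguments. The sketch below uses a hypothetical `RoundRecorder` class (not a pandas type) to record what arrives.

```python
import numpy as np


class RoundRecorder:
    """Hypothetical object that records how np.round calls its round method."""

    def __init__(self):
        self.calls = []

    def round(self, decimals=0, *args, **kwargs):
        # Record whatever NumPy passes through.
        self.calls.append((decimals, args, kwargs))
        return self


obj = RoundRecorder()
result = np.round(obj)
print(obj.calls)
```

A test for the real method could call `np.round` on the array and compare against the result of calling `.round()` directly.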

jreback · 2021-01-06T22:22:46Z

Since I think being able to round is quite essential functionality for floating dtypes that we missed, I think it makes sense to backport this, exactly because it is still experimental.
(Since it's experimental anyway, we don't need to be as careful about backporting as for other parts of pandas, and by backporting this, it will be easier for users to already experiment with the new dtype in pandas 1.2.x, and thus we can get more feedback.)

I don't think we should be backporting things like this.

github-actions · 2021-02-06T00:12:22Z

This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.

jreback · 2021-02-11T01:30:19Z

Closing as stale. I think we need the general solution outlined above. Happy to re-open or take a new PR.

arw2019 added 3 commits December 31, 2020 15:47

add test for nullable float

1df9bf7

implement round method for FloatingArray

192ebad

use pytest fixture

6403026

arw2019 marked this pull request as draft December 31, 2020 21:43

arw2019 added the Series and NA - MaskedArrays labels Dec 31, 2020

jreback requested changes Jan 1, 2021


simonjayhawkins added this to the 1.2.1 milestone Jan 1, 2021

jreback modified the milestones: 1.2.1, 1.3 Jan 3, 2021

arw2019 added 5 commits January 4, 2021 17:39

review: rewriting using _apply

253d279

Merge branch 'master' of https://github.com/pandas-dev/pandas into GH…

3e0d605

…38844-FloatingArray-round

remove explicit FloatingArray reference from _apply

fe19597

move _apply, round to NumericArray

b557a50

move _apply, round to NumericArray

85e6d4f

arw2019 marked this pull request as ready for review January 4, 2021 22:49

arw2019 added 2 commits January 4, 2021 18:18

typing, consistency with Series.round

f79c0cd

whatsnew

d1cd6d8

jreback requested changes Jan 5, 2021


jorisvandenbossche reviewed Jan 5, 2021


github-actions bot added the Stale label Feb 6, 2021

jreback closed this Feb 11, 2021

benoit9126 mentioned this pull request Feb 11, 2021

ENH: Implement rounding for floating dtype array #38844 #39751

Merged

