ENH: Implement cummax and cummin in _accumulate() for ordered Categorical arrays #58360

bdwzhangumich · 2024-04-21T20:15:45Z

Closes #52335 (BUG: Can't compute cummin/cummax on ordered categorical). This addresses the first example in that issue but not the second.
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

We implemented cummax and cummin in _accumulate() for the Categorical array class. The implementation checks that the Categorical array is ordered and follows the pattern of the other _accumulate() implementations. Tests for the new behavior are included.

Co-authored with @xiangchris
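For readers following along, the idea in the description can be sketched outside pandas internals. This is a minimal sketch and not the code merged in this PR: `cum_extreme` is a hypothetical helper name, and the sketch assumes the accumulation runs on the integer codes of the Categorical (where -1 marks a missing value) using NumPy's `minimum.accumulate` and `maximum.accumulate`.

```python
import numpy as np
import pandas as pd


def cum_extreme(cat: pd.Categorical, name: str, skipna: bool = True) -> pd.Categorical:
    """Hypothetical helper: cumulative min/max over an ordered Categorical."""
    if name not in ("cummin", "cummax"):
        raise TypeError(f"Accumulation {name} not supported")
    if not cat.ordered:
        # Min/max are only meaningful when the categories have an order.
        raise TypeError(f"{name} requires an ordered Categorical")

    # astype returns a writable copy; missing values are encoded as -1.
    codes = cat.codes.astype(np.int64)
    mask = codes == -1

    if name == "cummin":
        # Push missing values to the largest integer so they never win a minimum.
        codes[mask] = np.iinfo(np.int64).max
        codes = np.minimum.accumulate(codes)
    else:
        # -1 is already below every valid code, so it never wins a maximum.
        codes = np.maximum.accumulate(codes)

    if not skipna:
        # Once a missing value is seen, every later position is missing too.
        mask = np.maximum.accumulate(mask)

    codes[mask] = -1
    return pd.Categorical.from_codes(codes, dtype=cat.dtype)


cat = pd.Categorical(
    ["a", np.nan, "b", "a", "c"], categories=list("abc"), ordered=True
)
print(cum_extreme(cat, "cummax"))  # codes: [0, -1, 1, 1, 2]
```

Working on the codes keeps the result in the same dtype as the input, so the categories and their order carry over unchanged.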

doc/source/whatsnew/v3.0.0.rst

mroeschke · 2024-04-22T17:24:54Z

pandas/core/array_algos/categorical_accumulations.py

+import numpy as np
+
+
+def _cum_func(


Can you just include all this logic in pandas/core/arrays/categorical.py?

bdwzhangumich · 2024-04-22

I moved the logic into pandas/core/arrays/categorical.py.

pandas/tests/arrays/categorical/test_cumulative.py

Co-authored-by: Matthew Roeschke <[email protected]>

mroeschke · 2024-04-23T17:52:11Z

pandas/tests/series/test_cumulative.py

+            ["a", np.nan, "b", "a", "c"],
+            dtype=pd.CategoricalDtype(list(order), ordered=True),
+        )
+        result = getattr(ser, method)(skipna=True)


Could you use pytest.mark.parametrize with skipna=True|False and the expected result?

xiangchris · 2024-04-23

Added a pytest.mark.parametrize to test_cummax_cummin_ordered_categorical_nan with skipna and the expected data.

mroeschke · 2024-04-23T17:52:27Z

pandas/tests/series/test_cumulative.py

+            pd.Series(
+                ["a", np.nan, "b", "b", "c"],
+                dtype=pd.CategoricalDtype(list(order), ordered=True),


Can you assign this to an expected variable?

xiangchris · 2024-04-23

Assigned all expected series to expected variables.

mroeschke · 2024-04-23T20:07:49Z

Thanks @bdwzhangumich

…ical arrays (pandas-dev#58360)

* Added tests with and without np.nan
* Added tests for cummin and cummax
* Fixed series tests expected series, rewrote categorical arrays to use pd.Categorical
* Fixed cat not defined error and misspelling
* Implement _accumulate for Categorical
* fixed misspellings in tests
* fixed expected categories on tests
* Updated whatsnew
* Update doc/source/whatsnew/v3.0.0.rst Co-authored-by: Matthew Roeschke <[email protected]>
* Removed testing for _accumulate.
* Moved categorical_accumulations.py logic to categorical.py
* Assigned expected results to expected variable; Added pytest.mark.parametrize to test_cummax_cummin_ordered_categorical_nan with skipna and expected data

---------

Co-authored-by: Christopher Xiang <[email protected]>
Co-authored-by: Matthew Roeschke <[email protected]>
Co-authored-by: Chris Xiang <[email protected]>

xiangchris and others added 12 commits April 14, 2024 02:02

Added tests with and without np.nan

fa74733

Added tests for cummin and cummax

269dcfb

Merge branch 'accumulate_categorical' of github.com:bdwzhangumich/pan…

a8a1f37

…das-brandon into accumulate_categorical

Fixed series tests expected series, rewrote categorical arrays to use…

9ec47bf

… pd.Categorical

Fixed cat not defined error and misspelling

c224faf

Implement _accumulate for Categorical

330b60d

Merge branch 'pandas-dev:main' into accumulate_categorical

e927cda

fixed misspellings in tests

0122334

Merge branch 'accumulate_categorical' of github.com:bdwzhangumich/pan…

4caea9f

…das-brandon into accumulate_categorical

fixed expected categories on tests

098ad64

Updated whatsnew

126bc19

Merge branch 'pandas-dev:main' into accumulate_categorical

498ad6d

mroeschke reviewed Apr 22, 2024

doc/source/whatsnew/v3.0.0.rst (outdated)

mroeschke reviewed Apr 22, 2024

pandas/tests/arrays/categorical/test_cumulative.py (outdated)

mroeschke added the Categorical (Categorical Data Type) and Reduction Operations (sum, mean, min, max, etc.) labels Apr 22, 2024

bdwzhangumich and others added 4 commits April 22, 2024 13:31

Update doc/source/whatsnew/v3.0.0.rst

18a86e2

Co-authored-by: Matthew Roeschke <[email protected]>

Removed testing for _accumulate.

5c5ac3c

Merge branch 'accumulate_categorical' of github.com:bdwzhangumich/pan…

7934f74

…das-brandon into accumulate_categorical

Moved categorical_accumulations.py logic to categorical.py

1abad3e

bdwzhangumich requested a review from mroeschke April 22, 2024 19:26

Merge branch 'main' into accumulate_categorical

6be5f8c

mroeschke reviewed Apr 23, 2024

Assigned expected results to expected variable; Added pytest.mark.par…

babc835

…ametrize to test_cummax_cummin_ordered_categorical_nan with skipna and expected data

bdwzhangumich requested a review from mroeschke April 23, 2024 19:13

mroeschke added this to the 3.0 milestone Apr 23, 2024

mroeschke approved these changes Apr 23, 2024

mroeschke merged commit 9d5c88e into pandas-dev:main Apr 23, 2024
45 of 46 checks passed
