
ENH: Add separate numba kernels for groupby aggregations #53731


Merged: 19 commits into pandas-dev:main from lithomas1:groupby-numba-kernels on Aug 10, 2023

Conversation

@lithomas1 (Member) commented Jun 19, 2023

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

This PR adds separate groupby numba kernels, since the old shared kernels with Window required an expensive sort beforehand.

E.g. look at this snakeviz profile (pay attention to the take call; for reference, column_looper is the part doing the actual aggregation):

[snakeviz profile screenshot]
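To illustrate the difference: the old shared-with-Window path had to sort/take the values by group before aggregating, while a grouped kernel can scatter each row straight into its group's accumulator using the group labels. A minimal NumPy sketch of the grouped-sum idea (not the actual pandas kernel, which is numba-jitted and also handles min_periods):

```python
import numpy as np

def grouped_sum(values, labels, ngroups):
    # One pass over the data: accumulate into the slot given by each
    # row's group label -- no sort/take of the input required.
    out = np.zeros(ngroups, dtype=np.float64)
    counts = np.zeros(ngroups, dtype=np.int64)
    for i in range(len(values)):
        lab = labels[i]
        if lab < 0:  # -1 marks rows excluded from any group
            continue
        v = values[i]
        if not np.isnan(v):
            out[lab] += v
            counts[lab] += 1
    # Groups with no observations yield NaN rather than 0
    out[counts == 0] = np.nan
    return out

vals = np.array([1.0, 2.0, np.nan, 4.0])
labs = np.array([0, 1, 0, 1])
grouped_sum(vals, labs, 2)  # array([1., 6.])
```

In the real kernels this loop body is what numba jit-compiles, so the per-row Python overhead disappears.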

Benchmarks

       before           after         ratio
     [43d12a23]       [6cf83167]
                      <groupby-numba-kernels>
-         197±5ms         66.6±1ms     0.34  groupby.GroupByNumbaAgg.time_frame_agg('float64', 'sum')
-         234±4ms       77.5±0.7ms     0.33  groupby.GroupByNumbaAgg.time_frame_agg('float64', 'var')
-         221±2ms       65.2±0.7ms     0.30  groupby.GroupByNumbaAgg.time_frame_agg('float64', 'mean')
-       20.1±0.1s         54.3±1ms     0.00  groupby.GroupByNumbaAgg.time_frame_agg('float64', 'max')
-       20.3±0.2s       54.3±0.8ms     0.00  groupby.GroupByNumbaAgg.time_frame_agg('float64', 'min')

Compared to Cython, we are still ~2x slower, but I think I can get back some more perf (would be a follow-up PR).

If parallel=True is passed, though, we can be around ~2x faster than our Cython impl.

(Benchmarks were run on a 2019 Intel MacBook Pro, Core i7-9750H w/ 16 GB RAM.)
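For reference, the parallel path is enabled through engine_kwargs on the groupby reduction. A usage sketch with made-up data (the engine="numba" path requires numba to be installed, so this falls back to the default Cython engine without it):

```python
import numpy as np
import pandas as pd

# Hypothetical frame: 1M rows spread over 100 groups
df = pd.DataFrame(
    {
        "key": np.random.randint(0, 100, 1_000_000),
        "val": np.random.randn(1_000_000),
    }
)

try:
    import numba  # noqa: F401

    # numba engine; parallel=True enables numba's threaded execution
    result = df.groupby("key")["val"].sum(
        engine="numba",
        engine_kwargs={"parallel": True},
    )
except ImportError:
    # numba not installed: use the default Cython engine
    result = df.groupby("key")["val"].sum()
```

Note the first numba call pays a one-off JIT compilation cost; the speedups quoted above are for subsequent calls.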

cc @jbrockmendel if interested.
(You and I talked about parallelism in pandas a couple weeks back).

@lithomas1 lithomas1 added Groupby numba numba-accelerated operations labels Jun 19, 2023
@lithomas1 lithomas1 requested a review from rhshadrach as a code owner June 19, 2023 17:51
@@ -1484,6 +1484,7 @@ def _numba_agg_general(
func: Callable,
dtype_mapping: dict[np.dtype, Any],
engine_kwargs: dict[str, bool] | None,
is_grouped_kernel: bool = False,
lithomas1 (Member Author):

Technically can remove this now, since all the existing numba kernels have been ported to have a groupby variant.

If you'd rather have me split this PR by kernel, this would be necessary (would remove after all the PRs land).

@lithomas1 lithomas1 force-pushed the groupby-numba-kernels branch from 6cf8316 to f29ea3e Compare June 19, 2023 17:55
@mroeschke (Member):

While there is a nice speedup here, I am a little concerned about proliferating implementations of reductions in the spirit of #53261

@lithomas1 (Member Author):

While there is a nice speedup here, I am a little concerned about proliferating implementations of reductions in the spirit of #53261

The problem is the numba kernels are basically useless without this.

I did some benchmarking a while back, and numba was ~5x slower than Cython before this change.
Because the majority of time is spent in the sorting overhead, this means that even running the numba aggregations in parallel doesn't really provide a speedup.

With these kernels at least, I think we can share the Kahan summation part for sum/mean/var.
I could probably also reimplement mean in terms of sum, so the impact of this shouldn't be too horrible.
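For instance, a compensated-summation helper that sum and mean could share might look like this (a minimal pure-Python sketch of the idea; pandas' actual kernels are numba-jitted and NaN-aware):

```python
def kahan_sum(values):
    # Compensated (Kahan) summation: carry a running correction term
    # so rounding error does not accumulate over many additions.
    total = 0.0
    comp = 0.0
    for v in values:
        y = v - comp
        t = total + y
        comp = (t - total) - y
        total = t
    return total


def kahan_mean(values):
    # mean reimplemented in terms of the shared compensated sum
    return kahan_sum(values) / len(values)
```

With many additions of a value like 0.1, the compensated sum stays close to exact while a naive running sum drifts, which is why sharing this piece across sum/mean/var is attractive.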

@lithomas1 (Member Author):

Gentle ping
@jbrockmendel @mroeschke @rhshadrach

lithomas1 referenced this pull request Jul 23, 2023
* ENH: non float64 result support in numba groupby

* refactor & simplify

* fix CI

* maybe green?

* skip unsupported ops in other bench as well

* updates from code review

* remove commented code

* update whatsnew

* debug benchmarks

* Skip min/max benchmarks
@@ -18,6 +18,9 @@
"pandas/_testing/__init__.py",
"pandas/_testing/_hypothesis.py",
"pandas/_testing/_io.py",
# Numba does not like typing.cast
Reviewer (Member):

is there any prospect of this being fixed in numba? i.e. condition under which we can remove this?

Reviewer (Member):

can you add something like "# TODO(numba#5474): can be removed once numba#5474 is addressed"

lithomas1 (Member Author):

Looks like we lost the ability to add comments in the pyright json config file.

@jbrockmendel (Member):

If parallel=True is passed, though, we can be around ~2x faster than our Cython impl.

Is this one of the use cases where there is a prospect of parallelizing the cython version too?

@lithomas1 (Member Author):

Is this one of the use cases where there is a prospect of parallelizing the cython version too?

In the near future, no, since shipping openmp in wheels is a PITA.
I think we need an openmp runtime package/wheel or something to depend on on PyPI before moving forwards with parallelizing Cython.

numba seems to have at least partially solved this problem: I think you can use the tbb backend when installing with pip, and on conda-forge you can choose between tbb and openmp.
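A sketch of how the threading layer gets selected in numba (the tbb layer only actually loads if the tbb package is installed, e.g. via pip install tbb):

```python
try:
    import numba

    # Select the threading layer before any parallel=True function is
    # compiled; besides "tbb", numba also supports "omp" (OpenMP) and
    # the pure-Python-packaging-friendly default "workqueue".
    numba.config.THREADING_LAYER = "tbb"
except ImportError:
    numba = None  # numba not installed; nothing to configure
```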

@jbrockmendel (Member):

No objection in principle; will defer to Matt since he's more involved in maintaining this code.

Comment on lines 182 to 183
if nobs == 0 == min_periods:
result = 0.0
Reviewer (Member):

Is this hit? Why default to 0.0 instead of nan?

lithomas1 (Member Author):

I don't think so, my bad. I copied it from the sliding window code.

This is updated now.

@lithomas1 (Member Author):

@mroeschke
Do you have time to take a look at this soon?

(I'd like to get this in, if not the apply stuff, at least for 2.1)

@mroeschke mroeschke added this to the 2.1 milestone Aug 10, 2023
@mroeschke (Member) left a comment:

LGTM, but still hoping one day we can have one aggregation kernel that can apply to both window and groupby (and standard) functions 🤞

@mroeschke mroeschke merged commit b0e1130 into pandas-dev:main Aug 10, 2023
@mroeschke (Member):

Thanks @lithomas1

@lithomas1 lithomas1 deleted the groupby-numba-kernels branch August 10, 2023 20:59
mroeschke pushed a commit to mroeschke/pandas that referenced this pull request Aug 18, 2023
…53731)

* ENH: Add separate numba kernels for groupby aggregations

* add whatsnew

* fixes from pre-commit

* fix window tests

* fix tests?

* cleanup

* is_grouped_kernel=True in groupby

* typing

* fix typing?

* fix now

* Just ignore

* remove unnecessary code

* remove comment
Labels
Groupby numba numba-accelerated operations
4 participants