add uint64 support for some libgroupby funcs #28931

jbrockmendel · 2019-10-11T17:19:03Z

cc @WillAyd I hope you agree the runtime_error thing here is easier to implement/review with fused type than it would be with tempita.

WillAyd · 2019-10-11T21:40:18Z

Can you add tests for these as well? Would be nice to have something parametrized in groupby

jbrockmendel · 2019-10-11T21:56:38Z

> Can you add tests for these as well? Would be nice to have something parametrized in groupby

I'll take a look and see if we have tests that get at these functions directly (as opposed to via df.groupby.whatever). I can attest that these are hit by the python code for which this adds a comment.

jbrockmendel · 2019-10-12T16:29:26Z

Following this and #28934, one more obvious cleanup is to move all of groupby_helper into groupby.pyx, since we are no longer using the templating.

jbrockmendel · 2019-10-12T16:30:08Z

@WillAyd it looks like we don't have any dedicated testing that gets directly at the functions in this file.

jreback · 2019-10-12T17:01:47Z

pandas/_libs/groupby_helper.pxi.in

                        else:
                            out[i, j] = NAN
                    else:
                        out[i, j] = resx[i, j]

+    if runtime_error:
+        # We cannot raise directly above because that is within a nogil


where are these caught?

This would get caught by one of the except Exception clauses in the groupby code that I'm trying to make more specific.
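For readers unfamiliar with the pattern in the hunk above, here is a minimal pure-Python sketch of "record the problem in a flag, raise after the loop". It is not the code from this PR: the function name `group_max_unsigned`, the empty-group condition, and the error message are all hypothetical, and plain Python has no nogil block, so the second loop only stands in for one.

```python
def group_max_unsigned(values, labels, ngroups):
    """Per-group maximum of non-negative ints (hypothetical illustration)."""
    maxx = [0] * ngroups   # 0 is the smallest unsigned value
    nobs = [0] * ngroups   # number of observations seen per group

    for value, label in zip(values, labels):
        nobs[label] += 1
        if value > maxx[label]:
            maxx[label] = value

    out = [0] * ngroups
    runtime_error = False

    # Stand-in for a `with nogil:` block in Cython, where raising a
    # Python exception is not possible. We only record the problem here.
    for i in range(ngroups):
        if nobs[i] == 0:
            # Assumed condition: an empty group has no representable
            # result for an unsigned dtype, so flag it and stop.
            runtime_error = True
            break
        out[i] = maxx[i]

    # Back "outside nogil": now it is safe to raise.
    if runtime_error:
        raise RuntimeError("empty group with unsigned dtype")
    return out
```

Calling `group_max_unsigned([3, 7, 5], [0, 1, 0], 2)` returns `[5, 7]`, while a call that leaves a group empty raises `RuntimeError` only after the loop has finished.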

jbrockmendel · 2019-10-14T14:43:41Z

Any major concerns? This is a blocker for getting at the except Exceptions in/around _cython_agg_blocks

WillAyd

I think you can add to the parametrization of this test here:

pandas/pandas/tests/groupby/test_function.py

Line 394 in 18a9e4c

def test_groupby_non_arithmetic_agg_types(dtype, method, data):

WillAyd · 2019-10-14T14:56:12Z

pandas/_libs/groupby_helper.pxi.in

@@ -439,6 +466,9 @@ def group_max(groupby_t[:, :] out,
        # Note: evaluated at compile-time
        maxx[:] = -_int64_max
        nan_val = NPY_NAT
+    elif groupby_t is uint64_t:
+        maxx[:] = 0
+        nan_val = 0  # We better not need this!  TODO: Can we use e.g. NULL?


Can you clarify what this is?

We have to define a value to treat as NA, and it has to be of the appropriate dtype. But for uint64 there is no such value, so the value defined here is just a placeholder.

I'll clarify the comment in the code.
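To make the placeholder point concrete, here is a small pure-Python sketch of the per-dtype choices. The int64 and uint64 values mirror the hunk above (with NPY_NAT taken to be the minimum int64); the float64 branch and the helper names `initial_max_and_na` and `is_valid_uint64` are assumptions added for illustration only.

```python
INT64_MAX = 2**63 - 1
INT64_MIN = -(2**63)        # the value NPY_NAT occupies in int64
UINT64_MAX = 2**64 - 1


def initial_max_and_na(dtype_name):
    """Return (starting value for a running max, value treated as NA)."""
    if dtype_name == "int64":
        # One int64 value is reserved as the NA marker, so the running
        # max starts just above it.
        return -INT64_MAX, INT64_MIN
    if dtype_name == "uint64":
        # Every uint64 bit pattern is a legitimate number, so the NA
        # value here is only a placeholder.
        return 0, 0
    if dtype_name == "float64":
        # Assumption for illustration: floats have a native NaN.
        return float("-inf"), float("nan")
    raise ValueError(f"unsupported dtype: {dtype_name}")


def is_valid_uint64(value):
    """True if value is a representable uint64 number."""
    return 0 <= value <= UINT64_MAX


_, na_placeholder = initial_max_and_na("uint64")
# The placeholder collides with real data, which is why it must never
# actually be used to mark a missing value.
print(is_valid_uint64(na_placeholder))
```

The int64 case works because one value is sacrificed as the marker; uint64 has no spare value to sacrifice, which is the gap the review comment asks about.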

jbrockmendel · 2019-10-14T17:18:29Z

> I think you can add to the parametrization of this test here:

@WillAyd added uint64 to this parametrization.

WillAyd

lgtm @jreback

WillAyd · 2019-10-15T14:40:41Z

I think a whatsnew for function support would be nice as well

jbrockmendel · 2019-10-15T15:32:24Z

> I think a whatsnew for function support would be nice as well

Only user-facing effect should be perf, which I haven't measured yet. Would prefer to do measurements in one thorough pass after the last except Exception is removed around cython_agg_general (this plus two PRs)

jreback · 2019-10-16T12:42:28Z

pandas/tests/groupby/test_function.py

@@ -378,7 +378,7 @@ def test_median_empty_bins(observed):


 @pytest.mark.parametrize(
-    "dtype", ["int8", "int16", "int32", "int64", "float32", "float64"]
+    "dtype", ["int8", "int16", "int32", "int64", "float32", "float64", "uint64"]


maybe use any_real_type fixture here (can be followup)

jreback · 2019-10-16T12:42:50Z

thanks

add uint64 support for some libgroupby funcs

0de0628

jbrockmendel mentioned this pull request Oct 11, 2019

REF: de-duplicate groupby_helper code #28934

Merged

WillAyd added the Dtype Conversions (Unexpected or buggy dtype conversions) and Groupby labels Oct 11, 2019

jreback reviewed Oct 12, 2019

jreback added this to the 1.0 milestone Oct 12, 2019

WillAyd requested changes Oct 14, 2019

jbrockmendel added 2 commits October 14, 2019 09:52

Merge branch 'master' of https://github.com/pandas-dev/pandas into fa…

3d66557

…ster41

clarify comment, add to parametrization

77a0b7e

WillAyd approved these changes Oct 15, 2019

jreback reviewed Oct 16, 2019

jreback merged commit a0d01b8 into pandas-dev:master Oct 16, 2019

jbrockmendel deleted the faster41 branch October 16, 2019 15:25

proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019

add uint64 support for some libgroupby funcs (pandas-dev#28931)

8386a56

proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019

add uint64 support for some libgroupby funcs (pandas-dev#28931)

e336fe1

bongolegend pushed a commit to bongolegend/pandas that referenced this pull request Jan 1, 2020

add uint64 support for some libgroupby funcs (pandas-dev#28931)

4bd8d6d
