REF: Pre-empt ValueError in _aggregate_series_fast #29500

jbrockmendel · 2019-11-09T01:58:13Z

cc @WillAyd @jreback. This is orthogonal to the other groupby PRs (including #29499) but will have merge conflicts with them. No preference on the order in which they go in.

simonjayhawkins

Thanks @jbrockmendel. A few comments; otherwise lgtm.

WillAyd · 2019-11-10T00:27:11Z

pandas/core/groupby/ops.py

@@ -583,7 +583,11 @@ def _transform(
        return result

    def agg_series(self, obj: Series, func):
-        if is_extension_array_dtype(obj.dtype) and obj.dtype.kind != "M":
+        if len(obj) == 0:
+            # SeriesGrouper would raise if we were to call _aggregate_series_fast


Hmm should we not just fix _aggregate_series_fast to not raise in that case then? Hoping to avoid special casing like this in any groupby functions

jbrockmendel · 2019-11-10

> Hoping to avoid special casing like this in any groupby functions

Agreed on the goal. At the moment this is the best way to make progress towards it. In particular, making the check explicit here is much clearer than catching the ValueError with the particular message on L600.

> should we not just fix _aggregate_series_fast to not raise in that case then?

#29499 does something along those lines.
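The trade-off discussed in this thread can be illustrated with a minimal, self-contained sketch. It does not use pandas: `fast_aggregate`, `slow_aggregate`, and the two `agg_series_*` wrappers are hypothetical stand-ins, and the error message is invented for the example. The sketch only contrasts catching a `ValueError` by its message with checking the input length up front.

```python
from collections import defaultdict


def _group_and_reduce(values, labels, func):
    # Shared helper (hypothetical): bucket values by label, skipping
    # negative labels, then reduce each bucket with func.
    groups = defaultdict(list)
    for value, label in zip(values, labels):
        if label >= 0:
            groups[label].append(value)
    return {label: func(group) for label, group in groups.items()}


def fast_aggregate(values, labels, func):
    # Stand-in for a fast path that cannot handle empty input.
    if len(values) == 0:
        raise ValueError("fast path requires non-empty input")
    return _group_and_reduce(values, labels, func)


def slow_aggregate(values, labels, func):
    # Stand-in for a general path that tolerates empty input.
    return _group_and_reduce(values, labels, func)


def agg_series_catching(values, labels, func):
    # Style being moved away from: try the fast path, then inspect the
    # error message to decide whether falling back is appropriate.
    try:
        return fast_aggregate(values, labels, func)
    except ValueError as err:
        if "requires non-empty" in str(err):
            return slow_aggregate(values, labels, func)
        raise


def agg_series_preempting(values, labels, func):
    # Style proposed here: test the known failure condition explicitly,
    # so the fast path is never called with input it would reject.
    if len(values) == 0:
        return slow_aggregate(values, labels, func)
    return fast_aggregate(values, labels, func)
```

Both wrappers return the same results; the difference is that the pre-emptive version states the condition in the caller instead of depending on the wording of an exception message.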

WillAyd · 2019-11-10T00:29:08Z

pandas/tests/groupby/test_bin_groupby.py

+    dummy = obj[:0]
+    labels = np.array([-1, -1, -1, 0, 0, 0, 1, 1, 1, 1], dtype=np.int64)
+
+    with pytest.raises(ValueError, match="SeriesGrouper requires non-empty `series`"):


Is this visible from an end user perspective?

jbrockmendel · 2019-11-10

This should never be raised to the user; it just seemed like the behavior change merited a test.
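A test of this shape can be sketched without pandas or pytest. The example below is an assumption-laden illustration, not the PR's test: `GrouperSketch` is a hypothetical stand-in for the Cython `SeriesGrouper`, plain lists replace the Series and NumPy array, and the standard library's `assertRaisesRegex` takes the place of `pytest.raises(..., match=...)`. Both apply a regular-expression search to the exception message.

```python
import unittest


class GrouperSketch:
    """Hypothetical stand-in for a grouper that rejects empty input."""

    def __init__(self, series, labels):
        if len(series) == 0:
            raise ValueError("SeriesGrouper requires non-empty `series`")
        self.series = series
        self.labels = labels


class TestEmptyInput(unittest.TestCase):
    def test_empty_series_raises(self):
        obj = list(range(10))
        dummy = obj[:0]  # empty slice, mirroring `dummy = obj[:0]` in the diff
        labels = [-1, -1, -1, 0, 0, 0, 1, 1, 1, 1]
        # The pattern is searched for anywhere in the message.
        with self.assertRaisesRegex(ValueError, "requires non-empty `series`"):
            GrouperSketch(dummy, labels)


def run_sketch_tests():
    # Run the single test case programmatically and return the result object.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestEmptyInput)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Pinning the message in a test like this documents an internal contract: callers are expected to avoid the empty case, and the error exists only to catch a caller that does not.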

jreback · 2019-11-12T19:27:46Z

lgtm. Can you merge master here and then merge on green?

jbrockmendel · 2019-11-12T21:25:32Z

rebased+green; ping.

jreback · 2019-11-12T23:08:58Z

thanks @jbrockmendel

jbrockmendel added 3 commits November 8, 2019 16:33

pre-empt ValueError by checking input length

b65b373

Merge branch 'master' of https://github.com/pandas-dev/pandas into faster-ngroups3

430de63

dummy ot force ci

c255850

gfyoung added the "Error Reporting" (Incorrect or improved errors from pandas) and "Internals" (Related to non-user accessible pandas implementation) labels Nov 9, 2019

simonjayhawkins reviewed Nov 9, 2019

pandas/tests/groupby/test_bin_groupby.py (outdated)

pandas/_libs/reduction.pyx (outdated)

pandas/tests/groupby/test_bin_groupby.py (outdated)

gfyoung reviewed Nov 9, 2019

pandas/core/groupby/ops.py (outdated)

gfyoung reviewed Nov 9, 2019

pandas/tests/groupby/test_bin_groupby.py

jbrockmendel added 2 commits November 9, 2019 09:45

update per comments

06bac18

update per comments

e4e9620

gfyoung approved these changes Nov 9, 2019

WillAyd requested changes Nov 10, 2019

jbrockmendel added 2 commits November 10, 2019 08:55

Merge branch 'master' of https://github.com/pandas-dev/pandas into faster-ngroups3

de5ecd7

change check to assertion

e02ffd2

jreback added this to the 1.0 milestone Nov 12, 2019

Merge branch 'master' of https://github.com/pandas-dev/pandas into faster-ngroups3

38067ed

jreback merged commit ab9dca0 into pandas-dev:master Nov 12, 2019

jbrockmendel deleted the faster-ngroups3 branch November 12, 2019 23:18

Reksbril pushed a commit to Reksbril/pandas that referenced this pull request Nov 18, 2019

REF: Pre-empt ValueError in _aggregate_series_fast (pandas-dev#29500)

d57109f

proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019

REF: Pre-empt ValueError in _aggregate_series_fast (pandas-dev#29500)

c4ba849

proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019

REF: Pre-empt ValueError in _aggregate_series_fast (pandas-dev#29500)

d253e02
