-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Consolidate nth / last object Groupby Implementations #19610
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -360,11 +361,19 @@ def group_last_{{name}}(ndarray[{{dest_type2}}, ndim=2] out, | |||
raise AssertionError("len(index) != len(labels)") | |||
|
|||
nobs = np.zeros((<object> out).shape, dtype=np.int64) | |||
{{if name=='object'}} | |||
resx = np.empty((<object> out).shape, dtype=object) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was really hoping to not even have this conditional, but when trying resx = np.empty_like(out)
and even resx = np.empty_like(out, dtype='object')
it kept SegFaulting on objects
Codecov Report
@@ Coverage Diff @@
## master #19610 +/- ##
==========================================
- Coverage 91.62% 91.59% -0.03%
==========================================
Files 150 150
Lines 48790 48790
==========================================
- Hits 44703 44691 -12
- Misses 4087 4099 +12
Continue to review full report at Codecov.
|
pandas/tests/groupby/test_groupby.py
Outdated
('last', [('bar', 'grault'), ('foo', 'quux')], []), | ||
('nth', [('bar', 'corge'), ('foo', 'qux')], [1]), | ||
]) | ||
def test_groupby_get_nth_object(self, method, exp, args): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think the nth tests are in another test file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. going to merge the other PR first.
@@ -2086,47 +2086,45 @@ def test_median_empty_bins(self): | |||
expected = df.groupby(bins).agg(lambda x: x.median()) | |||
assert_frame_equal(result, expected) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a tests with the numeric ops on groupby / object and assert they raise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming you want this in a different change no problem. That said, do you consider the numeric ops to be add
, prod
, min
, max
, mean
, median
, var
, ohlc
, cumprod
, cumsum
, cummin
, cummax
and rank
? The tests are one thing, but these don't all raise at the moment so would have to couple that with some refactoring of the groupby
module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes let's do this in another change.
thanks! |
@WillAyd I merged this, then reverted it. Can you send a new PR that is rebased on master? |
Reverted in 5c76f33 |
…-dev#19610)" This reverts commit d4730e6.
git diff upstream/master -u -- "*.py" | flake8 --diff
Depending on what goes first between this and #19481 some merge conflicts will need to be cleaned up