-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Fixes grouped rank bug with nullable types. #54460
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from 20 commits
41d4419
ce3b5d8
fc3ac9f
6a425f1
63faab6
41f4529
566cf2d
1551301
313219c
0a2d569
476a9b6
676d975
9a4215b
2b7ae59
091ce9f
18efb62
2251f7b
a7b8f79
5cf2b59
254d8d0
8961c61
31832f1
59d8b25
22e9239
65e1c45
1e22b7e
85627e6
e285efa
5a63a11
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -1492,6 +1492,9 @@ def _groupby_op( | |
else: | ||
result_mask = np.zeros(ngroups, dtype=bool) | ||
|
||
if op.how == "rank" and kwargs.get("na_option") in ["top", "bottom"]: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. nit: can just use |
||
result_mask[:] = False | ||
|
||
res_values = op._cython_op_ndim_compat( | ||
self._data, | ||
min_count=min_count, | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -710,3 +710,14 @@ def test_rank_categorical(): | |
|
||
expected = df.astype(object).groupby("col1").rank() | ||
tm.assert_frame_equal(res, expected) | ||
|
||
|
||
@pytest.mark.parametrize("na_option", ["top", "bottom"]) | ||
def test_groupby_op_with_nullables(na_option): | ||
# GH 54206 | ||
df_ext = DataFrame({"x": [None]}, dtype="Float64") | ||
rhshadrach marked this conversation as resolved.
Show resolved
Hide resolved
|
||
result_ext = df_ext.groupby("x", dropna=False)["x"].rank( | ||
method="min", na_option=na_option | ||
) | ||
expected_result_ext = Series([1.0], dtype="float64", name=result_ext.name) | ||
tm.assert_series_equal(result_ext, expected_result_ext, check_dtype=False) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why the suffix There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also - why There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @rhshadrach sir, I made all the suggested changes. The reason for adding There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
If the issue with the dtype is There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @Sai-Suraj-27: Friendly ping here; I think this is the last bit and then we're all set to go. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @rhshadrach sir, Sorry i missed this notification somehow, just saw it again today! I have removed the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you do
:meth:`DataFrameGroupBy.rank`
here instead. Then no need to say "when using the rank function". This gives the reader a direct link to the method.