-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
ERR: Consistent errors for non-numeric ranking. (#19560) #20670
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
it looks like another commit is included here. can you rebase on master. |
Codecov Report
@@ Coverage Diff @@
## master #20670 +/- ##
==========================================
- Coverage 91.84% 91.81% -0.04%
==========================================
Files 153 153
Lines 49275 49277 +2
==========================================
- Hits 45255 45242 -13
- Misses 4020 4035 +15
Continue to review full report at Codecov.
|
Rebased and added a whatsnew entry @jreback. |
@@ -418,6 +418,8 @@ Other Enhancements | |||
Backwards incompatible API changes | |||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | |||
|
|||
- Using :func:`DataFrame.rank` on a data frame with non-numeric entries other than ordered categoricals will raise a ValueError. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move to the list of issues. use double-backticks around ValueError
.
'data frame' -> DataFrame
raise ValueError("pandas.core.algorithms.rank " | ||
"not supported for unordered " | ||
"non-numeric data") | ||
if is_categorical_dtype(values): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could be simpler, maybe
if is_object_dtype(values) and not (is_categorical_dtype(values) and values.ordered):
raise ...
in the error message
just say ".rank().format(type(value).__name__)"
@@ -71,23 +71,22 @@ def test_rank2(self): | |||
result = df.rank(0, pct=True) | |||
tm.assert_frame_equal(result, expected) | |||
|
|||
df = DataFrame([['b', 'c', 'a'], ['a', 'c', 'b']]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you make this a separate test and move all of the cases. Ideally you could parameterize them
@@ -218,7 +217,7 @@ def test_rank_methods_frame(self): | |||
tm.assert_frame_equal(result, expected) | |||
|
|||
def test_rank_descending(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a test with assert that the object ones raise (unless it duplicates too much the above tests)
results = self.results | ||
|
||
for method, axis, dtype in product(results, [0, 1], dtypes): | ||
if (dtype, method) in disabled: | ||
if dtype == object: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use is_object_dtype
@@ -134,22 +134,27 @@ def test_rank_categorical(self): | |||
assert_series_equal(ordered.rank(), exp) | |||
assert_series_equal(ordered.rank(ascending=False), exp_desc) | |||
|
|||
# Unordered categoricals should be ranked as objects |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
this can fully close the issue. e.g. the issue is about raising an error, not actually supporting this. |
"not supported for unordered " | ||
"non-numeric data") | ||
|
||
# Ranking unordered categorials depreciated per #19560 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer to say "not supported" instead of "deprecated"
assert_series_equal(res1, exp_unordered1) | ||
|
||
# Won't raise ValueError because entries not objects. | ||
unordered1.rank() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this right? Certainly the example passes the "eye test" but if the Categorical is not ordered what semantics are we using for ranking?
@@ -379,9 +375,15 @@ def test_rank_int(self): | |||
def test_rank_object_bug(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Name of this test can be changed to test_rank_na_object_raises
can you rebase |
closing as stale. if you'd like to continue, pls ping. |
git diff upstream/master -u -- "*.py" | flake8 --diff
*This is only partial solution to issue #19560.
*There were some errors with the tests, but I guess they are unrelated to these changes since I also
have them with master.
*I modified some tests so that they don't contradict with the update "don't allow objects to be ranked unless they are ordered categorials" that was suggested in #19560.