-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
FIX: add support for desc order when ranking infs with nans #19538 #20091
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hello @peterpanmj! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on March 25, 2018 at 10:11 Hours UTC |
Codecov Report
@@ Coverage Diff @@
## master #20091 +/- ##
==========================================
- Coverage 91.72% 91.7% -0.03%
==========================================
Files 150 150
Lines 49149 49149
==========================================
- Hits 45083 45071 -12
- Misses 4066 4078 +12
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #20091 +/- ##
==========================================
- Coverage 91.84% 91.82% -0.02%
==========================================
Files 152 152
Lines 49259 49249 -10
==========================================
- Hits 45241 45225 -16
- Misses 4018 4024 +6
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add a whatsnew note
pandas/tests/series/test_rank.py
Outdated
@@ -263,8 +264,11 @@ def test_rank_tie_methods_on_infs_nans(self): | |||
chunk = 3 | |||
disabled = set([('object', 'first')]) | |||
|
|||
def _check(s, expected, method='average', na_option='keep'): | |||
result = s.rank(method=method, na_option=na_option) | |||
def _check(s, expected, method='average', na_option='keep', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you parametrize this test
pandas/tests/series/test_rank.py
Outdated
_check(iseries, order, method, na_opt) | ||
order = [ranks[0], [np.nan] * chunk, ranks[1]] | ||
_check(iseries, order, method, na_opt, True) | ||
_check(iseries, order[::-1], method, na_opt, False) | ||
|
||
def test_rank_methods_series(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ca you change this to use the @td.skip_if_no_scipy
decorator instead
can you add the simple example from the issue as a test as well. |
The Travis CI build failed after some modification on "pandas\tests\series\test_rank.py". I have no idea what is the cause. |
somehow lots of commits go there. merge in master and push again. |
pandas/tests/series/test_rank.py
Outdated
exp_ranks = { | ||
'average': ([2, 2, 2], [5, 5, 5], [8, 8, 8]), | ||
'min': ([1, 1, 1], [4, 4, 4], [7, 7, 7]), | ||
'max': ([3, 3, 3], [6, 6, 6], [9, 9, 9]), | ||
'first': ([1, 2, 3], [4, 5, 6], [7, 8, 9]), | ||
'dense': ([1, 1, 1], [2, 2, 2], [3, 3, 3]) | ||
} | ||
na_options = ('top', 'bottom', 'keep') | ||
|
||
def _check(s, method, na_option, ascending): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you inline these below
pandas/tests/series/test_rank.py
Outdated
_check(iseries, method, na_option, ascending) | ||
|
||
def test_rank_desc_mix_nans_infs(self): | ||
iseries = Series([1, np.nan, np.inf, -np.inf, 25]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add the issue number
, add some comments
lint.sh gives error. Don't know why. |
pandas/tests/series/test_rank.py
Outdated
result = s.rank(method=method, na_option=na_option) | ||
def _check(s, method, na_option, ascending): | ||
exp_ranks = { | ||
'average': ([2, 2, 2], [5, 5, 5], [8, 8, 8]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should start four spaces to the right of the e
in exp_ranks
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise with each one below it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comments on the linking errors in https://travis-ci.org/pandas-dev/pandas/jobs/353646265#L3004
Run flake8 pandas/tests/series/test_rank.py
locally before pushing to validate your fixes.
pandas/tests/series/test_rank.py
Outdated
result = s.rank(method=method, na_option=na_option) | ||
def _check(s, method, na_option, ascending): | ||
exp_ranks = { | ||
'average': ([2, 2, 2], [5, 5, 5], [8, 8, 8]), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise with each one below it.
pandas/tests/series/test_rank.py
Outdated
_check(iseries, method, na_option, ascending) | ||
|
||
def test_rank_desc_mix_nans_infs(self): | ||
#GH 19538 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Space after #
pandas/tests/series/test_rank.py
Outdated
|
||
def test_rank_desc_mix_nans_infs(self): | ||
#GH 19538 | ||
#check descending ranking when mix nans and infs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Space after #
@WillAyd if you'd have a look |
@peterpanmj nice work - does this work for GroupBy ranking the same way? |
@WillAyd As long as rank_1d_{dtype} is called, it is fine, but group by rank is still broken.
|
Do you want to try and tackle the groupby implementation as part of this? I've linked the relevant function below - it isn't that drastically different from the algos rank implementation pandas/pandas/_libs/groupby_helper.pxi.in Line 415 in 766a480
|
I will try but it should take a lot of efforts. |
Labels are int64 - each unique grouping will have it's own label. out has |
can do the groupby fixes in another PR :> @peterpanmj or @WillAyd can you open an issue? |
thanks @peterpanmj |
Opened #20561 for the GroupBy piece. @peterpanmj if you started working on it then by all means continue and let me know if you need help |
Please include the output of the validation script below between the "```" ticks:
Checklist for other PRs (remove this part if you are doing a PR for the pandas documentation sprint):
git diff upstream/master -u -- "*.py" | flake8 --diff