-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
Fix makeIntIndex, benchmark get loc #19483
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -206,9 +207,9 @@ def test_scalar_integer(self): | |||
# test how scalar float indexers work on int indexes | |||
|
|||
# integer index | |||
for index in [tm.makeIntIndex, tm.makeRangeIndex]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than change this, wouldn't it be better to simply fix the make*
ones? this is pretty much an anti-patter and the reason we have the helpers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only way to "fix" makeIntIndex
for this test to work is to leave it as it is, but as I stated, it limits comparability across index types, since others return unsorted stuff.
This test does something different than using the makeIntIndex
to get "some index filled with integers": it assumes that the result has a specific content. I see this as an anti-pattern.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are you removing testing of RangeIndex? (yes you convert to it, but its not very explicit)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just simpler, but OK, I can make it explicit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(done), ping
Codecov Report
@@ Coverage Diff @@
## master #19483 +/- ##
==========================================
- Coverage 91.67% 91.67% -0.01%
==========================================
Files 148 148
Lines 48543 48541 -2
==========================================
- Hits 44502 44499 -3
- Misses 4041 4042 +1
Continue to review full report at Codecov.
|
pandas/util/testing.py
Outdated
@@ -1560,7 +1560,9 @@ def makeBoolIndex(k=10, name=None): | |||
|
|||
|
|||
def makeIntIndex(k=10, name=None): | |||
return Index(lrange(k), name=name) | |||
if k == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... because otherwise the index comes with an element anyway. But I'll reformulate
@@ -206,9 +207,9 @@ def test_scalar_integer(self): | |||
# test how scalar float indexers work on int indexes | |||
|
|||
# integer index | |||
for index in [tm.makeIntIndex, tm.makeRangeIndex]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are you removing testing of RangeIndex? (yes you convert to it, but its not very explicit)
@@ -1560,11 +1560,11 @@ def makeBoolIndex(k=10, name=None): | |||
|
|||
|
|||
def makeIntIndex(k=10, name=None): | |||
return Index(lrange(k), name=name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are you changing this? sure this is an int index, but the ordering is not expected. Revert this, and if you really need it then make a fixture in the appropriate places
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the ordering is not expected.
Why do you expect an index of int (and not of float, str) to be sorted?
What don't you understand in the description of this PR?
Sure, I can write all the fixtures you like. It just doesn't make any sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because for testing using something like arange is the expected input.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this has nothing to do with the description. you are free to use sorted, unique, not-sorted whatever in a test, but that should be explicit. you are changing a global default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because for testing using something like arange is the expected input.
If you expect that, fine to me: shall I then change makeFloatIndex
and makeStringIndex
to return sorted indexes, so that when I shuffle them it is "explicit" that they are unsorted? I couldn't care less if standard test indexes are sorted or not, as long as they all are and I don't need to add useless lines of codes in tests.
Currently not only it's not "explicit" if tests run on sorted indexes or not - it actually depends on the type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general when you run a test on different types you want to change only the type. Not other characteristics of the index. Then sure, there are some unavoidable differences (e.g. a RangeIndex
is sorted)... but apart from that, they should be as comparable as possible, so that tests can really test the different code paths.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because for testing using something like arange is the expected input.
Actually, the concept itself of "expected input" (referring to content) is completely wrong in a test suite.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls do it my way. you are changing a function that is public facing for no real reason.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls do it my way. you are changing a function that is public facing for no real reason.
Oh come on Jeff, the function is clearly not part of the API. I gave "real reasons", you didn't. I even proposed an alternative, which you didn't even comment... you're certainly not going to convince me by asking "pls do it my way".
By the way, you are a core dev, you can always merge and then change (e.g. the new asv tests), so if you don't like my PR and don't like to discuss... you don't actually need to waste your time here, just merge and fix to your liking.
@@ -1560,11 +1560,11 @@ def makeBoolIndex(k=10, name=None): | |||
|
|||
|
|||
def makeIntIndex(k=10, name=None): | |||
return Index(lrange(k), name=name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pls do it my way. you are changing a function that is public facing for no real reason.
@jreback You realize that the asv test is broken in that it never tests unsorted |
and that change should be in the asv itself |
fine |
@jreback if you make changes to a PR before merging (which is OK, I also do that from time to time), please push to this branch on github and merge here, otherwise it is really un-transparent to see what has been merged (the merge script has even an option to do this). |
Author: Pietro Battiston <[email protected]> Closes pandas-dev#19483 from toobaz/test_get_loc and squashes the following commits: 51d6911 [Pietro Battiston] TST: benchmark get_loc in various cases d424f63 [Pietro Battiston] TST: produce unsorted integer index (consistently with other types)
git diff upstream/master -u -- "*.py" | flake8 --diff
makeIntIndex
currently produces a monotonic index, and hence tests common to different classes actually test very different things (i.e. the sorted vs. unsorted code paths). I had to fix this (and to fix the tests which erroneously relied on the specific content ofmakeIntIndex(k)
) in order to add meaningful performance tests forget_loc
.