Skip to content

BUG: #10645 in using MultiIndex.__contains__ #10675

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

scari
Copy link
Contributor

@scari scari commented Jul 26, 2015

This PR fix a BUG #10645
@sinhrks would you review my PR?

if util.get_value_at(values, loc) != val:
try:
if util.get_value_at(values, loc) != val:
raise KeyError(val)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this just papers over things and does not solve the issue

@jreback
Copy link
Contributor

jreback commented Jul 26, 2015

needs tests

@scari
Copy link
Contributor Author

scari commented Jul 26, 2015

Thanks for mentioning about missing test. Just added. :)

@scari
Copy link
Contributor Author

scari commented Jul 27, 2015

I revisit this issue and Yes, you're right @jreback I did further investigation and it seems a odd behavior of Cython. so this PR doesn't solve the issue.

I'd like to finish this issue. Do you have any idea with this? Anything would be appreciated.

@jreback
Copy link
Contributor

jreback commented Jul 27, 2015

so the indexes have different behavior if > 1M elements on lookups. so that is a different path that is taken.

@scari
Copy link
Contributor Author

scari commented Jul 27, 2015

This isn't Cython issue.

https://github.com/pydata/pandas/blob/master/pandas/index.pyx#L141-L148
If > 1M elements on lookups, IndexEngine.get_loc() do _bin_search() and try to get_value_at() the loc.
If the loc is equal to len(values) (like this error case), it looks ok to raise IndexError in util.get_value_at() because it means the loc is out of bounds.

So we should handle this IndexError somewhere.
IMHO, get_loc() itself has a 'key' parameter and if it fail key lookup, it's okay to raise KeyError here.

How's your thought? @jreback

@scari
Copy link
Contributor Author

scari commented Jul 29, 2015

This PR also fixes #10692
@jreback Any comments would be appreciated.

@sinhrks
Copy link
Member

sinhrks commented Jul 29, 2015

@ scari Gr8! Could you add test cases and release note for #10692? Then, pls squash to a single commit.

@jreback
Copy link
Contributor

jreback commented Jul 29, 2015

this is not correct
you are fixing a symptom not the problem
I don't think the indexer is setup correctly

@scari
Copy link
Contributor Author

scari commented Jul 29, 2015

Okay. I'll take another look at this.

@kawochen
Copy link
Contributor

kawochen commented Sep 8, 2015

@scari I'll submit a PR if you don't have time to come back to this.

@jreback jreback added Bug Indexing Related to indexing on series/frames, not to indexes themselves MultiIndex labels Sep 10, 2015
@jreback
Copy link
Contributor

jreback commented Sep 10, 2015

replaced by #11049

@jreback jreback closed this Sep 10, 2015
@scari
Copy link
Contributor Author

scari commented Oct 12, 2015

@kawochen Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Indexing Related to indexing on series/frames, not to indexes themselves MultiIndex
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants