Skip to content

BUG: raise on non-hashable in __contains__ #30902

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Jan 20, 2020

Conversation

jbrockmendel
Copy link
Member

By checking before we get to the Engine.__contains__ call, we can avoid a redundant call to hash

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@WillAyd WillAyd added the Index Related to the Index class or subclasses label Jan 10, 2020
Copy link
Member

@simonjayhawkins simonjayhawkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jbrockmendel the issue title includes BUG, is there a related issue?

@@ -409,7 +408,9 @@ cdef class DatetimeEngine(Int64Engine):
cdef _get_box_dtype(self):
return 'M8[ns]'

def __contains__(self, object val):
def __contains__(self, val: object) -> bool:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

@@ -717,7 +718,9 @@ cdef class BaseMultiIndexCodesEngine:

return indexer

def __contains__(self, object val):
def __contains__(self, val: object) -> bool:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same

@@ -390,6 +390,7 @@ def __contains__(self, key) -> bool:
if is_scalar(key) and isna(key):
return self.hasnans

hash(key)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this is to check that key is hashable. can you not type key as Hashable?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's not a run-time check

@jbrockmendel
Copy link
Member Author

mypy didnt like my attempt at annotating, so reverted

@jbrockmendel jbrockmendel added the Indexing Related to indexing on series/frames, not to indexes themselves label Jan 11, 2020
@simonjayhawkins
Copy link
Member

mypy didnt like my attempt at annotating, so reverted

is this because __contains__ is a special method and this code is unsafe. Is this not also a breaking change?

@jbrockmendel
Copy link
Member Author

restored annotations. turns out the trouble was that RangeIndex.__contains__ had a non-matching annotation

@simonjayhawkins
Copy link
Member

restored annotations. turns out the trouble was that RangeIndex.__contains__ had a non-matching annotation

I think mypy should only complain if the class is a subclass of collections.abc.Container
even so, in typeshed, classes that want to restrict the type of key add a type:ignore.

so rather than Any, can Hashable be used?

@jbrockmendel
Copy link
Member Author

so rather than Any, can Hashable be used?

When I use Optional[Hashable] mypy gives:

pandas/core/indexes/numeric.py:236: error: No overload variant of "int" matches argument type "Optional[Hashable]"
pandas/core/indexes/numeric.py:236: note: Possible overload variant:
pandas/core/indexes/numeric.py:236: note:     def int(self, x: Union[str, bytes, SupportsInt] = ...) -> int
pandas/core/indexes/numeric.py:236: note:     <1 more non-matching overload not shown>
pandas/core/indexes/numeric.py:488: error: Argument 1 to "len" has incompatible type "Optional[Hashable]"; expected "Sized"
pandas/core/indexes/numeric.py:488: error: Item "Hashable" of "Optional[Hashable]" has no attribute "item"
pandas/core/indexes/numeric.py:488: error: Item "None" of "Optional[Hashable]" has no attribute "item"
pandas/core/indexes/numeric.py:490: error: Argument 1 to "len" has incompatible type "Optional[Hashable]"; expected "Sized"
Found 5 errors in 1 file (checked 846 source files)

@WillAyd
Copy link
Member

WillAyd commented Jan 17, 2020

Yea I guess might be tough to get that to work with mypy since we are catching the TypeError at runtime, though passing a non-hashable item to that would currently raise an error at the hash() call that is uncaught, so I think mypy is highlighting an actual bug in a roundabout way

@jreback jreback added this to the 1.1 milestone Jan 18, 2020
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks fine. can you add assert hash(key) in the cython code?

@jbrockmendel
Copy link
Member Author

looks fine. can you add assert hash(key) in the cython code?

A big part of the motivation here is to avoid duplicated hash calls

@jreback
Copy link
Contributor

jreback commented Jan 18, 2020

looks fine. can you add assert hash(key) in the cython code?

A big part of the motivation here is to avoid duplicated hash calls

do asserts not get stripped? isn't that the point? (its not a big deal as long as you have a comment, but the asserts are to provide developer guarantees)

also needs a rebase

Copy link
Member

@jorisvandenbossche jorisvandenbossche left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice clean-up!

@jreback jreback merged commit 06e416d into pandas-dev:master Jan 20, 2020
@jreback
Copy link
Contributor

jreback commented Jan 20, 2020

thanks @jbrockmendel

@jbrockmendel jbrockmendel deleted the cln-contains branch January 20, 2020 19:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Index Related to the Index class or subclasses Indexing Related to indexing on series/frames, not to indexes themselves
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants