-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
CLN: Verify non-ndarray is an ExtensionArray in hash_array GH42003 #42008
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hello @TheNeuralBit! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2021-06-14 20:31:18 UTC |
092c0aa
to
e33b1e5
Compare
@@ -286,6 +286,8 @@ def hash_array( | |||
return _hash_categorical(vals, encoding, hash_key) | |||
elif not isinstance(vals, np.ndarray): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i would rather explictily check for an extension array here.
also pls add a test for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's already
if not hasattr(vals, "dtype"):
raise TypeError("must pass a ndarray-like")
at the top of the function for validation. Could maybe change that to check for EA/np.ndarray explicitly and change the message to be more user friendly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep agreed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
importing ExtensionArray
so I can use it in an isinstance check is resulting in a circular import:
❯ python -m pytest pandas/tests/util/test_hashing.py
ImportError while loading conftest '.../pandas/pandas/conftest.py'.
pandas/__init__.py:22: in <module>
from pandas.compat import (
pandas/compat/__init__.py:15: in <module>
from pandas.compat.numpy import (
pandas/compat/numpy/__init__.py:7: in <module>
from pandas.util.version import Version
pandas/util/__init__.py:7: in <module>
from pandas.core.util.hashing import ( # noqa
pandas/core/util/hashing.py:23: in <module>
from pandas.core.arrays.base import ExtensionArray
pandas/core/arrays/__init__.py:1: in <module>
from pandas.core.arrays.base import (
pandas/core/arrays/base.py:32: in <module>
from pandas.compat import set_function_name
E ImportError: cannot import name 'set_function_name' from partially initialized module 'pandas.compat' (most likely due to a circular import) (.../pandas/pandas/compat/__init__.py)
Any advice on how to resolve this? I could just move the import inside hash_array
, but not sure if that's consistent with pandas coding conventions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we have a is_array_like
in pandas/core/dtypes/inference.py
that also includes Index
and Series
so is not suitable here.
Most public functions that accept an numpy array/EA probably accept a array-like even if not documented, like this one used to.
It is mainly internal functions where we would be more strict on the types accepted.
So we could maybe add a is_array_like_strict
but we would normally already have validated the arguments that are passed internally so may not have much utility outside this public method which is more restrictive than most.
For now, I think OK to move the imports inside hash_array
until we know we want to do this type of check elsewhere.
This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this. |
Thanks for the PR, but it appears it has gone stale and needs to address the review. Closing but happy to reopen if you're still interested in continuing. |
Closes #42003