CLN: Assorted typings #28604
@@ -26,7 +26,7 @@
 _default_hash_key = "0123456789123456"
 
 
-def _combine_hash_arrays(arrays, num_items):
+def _combine_hash_arrays(arrays, num_items: int):
     """
     Parameters
     ----------
@@ -55,7 +55,11 @@ def _combine_hash_arrays(arrays, num_items):
 
 
 def hash_pandas_object(
-    obj, index=True, encoding="utf8", hash_key=None, categorize=True
+    obj,
+    index: bool = True,
+    encoding: str = "utf8",
+    hash_key=None,
+    categorize: bool = True,
 ):
     """
     Return a data hash of the Index/Series/DataFrame.
@@ -125,7 +129,10 @@ def hash_pandas_object(
                 for _ in [None]
             )
             num_items += 1
-            hashes = itertools.chain(hashes, index_hash_generator)
+
+            # keep `hashes` specifically a generator to keep mypy happy
+            _hashes = itertools.chain(hashes, index_hash_generator)
+            hashes = (x for x in _hashes)
you can also do a

Our type hints policy is "..., the use of cast is strongly discouraged. Where applicable a refactor of the code to appease static analysis is preferable". https://dev.pandas.io/docs/development/contributing.html#style-guidelines

Personally, I'm not a fan of changing the runtime behavior of code just to appease mypy unless it is absolutely necessary or it makes for cleaner code. (This applies to both our policy and the changes in this PR.)

The problem in this function arises because mypy treats the initial assignment as the definition of a variable (https://mypy.readthedocs.io/en/stable/type_inference_and_annotations.html#type-inference), which occurs on L114. It is possible to override the inferred type of a variable by using a variable type annotation (https://mypy.readthedocs.io/en/stable/type_inference_and_annotations.html#explicit-types-for-variables). This is the approach I took to silence this mypy error in 304351e. I prefer this as it has no effect on the runtime behavior.

with regard to the use of a leading underscore to avoid

It looks like the main actionable suggestion here is to change
         h = _combine_hash_arrays(hashes, num_items)
 
         h = Series(h, index=obj.index, dtype="uint64", copy=False)
@@ -179,7 +186,7 @@ def hash_tuples(vals, encoding="utf8", hash_key=None):
     return h
 
 
-def hash_tuple(val, encoding="utf8", hash_key=None):
+def hash_tuple(val, encoding: str = "utf8", hash_key=None):
     """
     Hash a single tuple efficiently
 
@@ -201,7 +208,7 @@ def hash_tuple(val, encoding="utf8", hash_key=None):
     return h
 
 
-def _hash_categorical(c, encoding, hash_key):
+def _hash_categorical(c, encoding: str, hash_key: str):
     """
     Hash a Categorical by hashing its categories, and then mapping the codes
     to the hashes
@@ -239,7 +246,7 @@ def _hash_categorical(c, encoding, hash_key):
     return result
 
 
-def hash_array(vals, encoding="utf8", hash_key=None, categorize=True):
+def hash_array(vals, encoding: str = "utf8", hash_key=None, categorize: bool = True):
     """
     Given a 1d array, return an array of deterministic integers.
 
@@ -317,7 +324,7 @@ def hash_array(vals, encoding="utf8", hash_key=None, categorize=True):
     return vals
 
 
-def _hash_scalar(val, encoding="utf8", hash_key=None):
+def _hash_scalar(val, encoding: str = "utf8", hash_key=None):
     """
     Hash scalar value
 
Adding type hints here is causing

pandas/core/util/hashing.py:132: error: Incompatible types in assignment (expression has type "chain[Any]", variable has type "Generator[Any, None, None]")

further down, as the body is now checked.

It looks like the conda environment now has mypy 0.720. Result!

The fix is here, 304351e, if you want to include it in the PR. (Can't use py3.6 variable annotations though.)
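
For anyone following the thread, here is a minimal sketch of the two ways to resolve the `chain[Any]` vs `Generator[Any, None, None]` error quoted above. The function names and parameters are hypothetical stand-ins, not the pandas code, and the second variant only assumes that 304351e uses a type comment; the commit itself is not shown here. The first variant re-wraps the chain in a generator expression (the approach in this PR's diff); the second widens the variable's declared type at its first assignment, which leaves the runtime behavior untouched.

```python
import itertools
from typing import Any, Iterable, Iterator


def chain_with_wrapper(first: Iterable[Any], second: Iterable[Any]) -> Iterator[Any]:
    # mypy infers Generator[Any, None, None] from this first assignment
    hashes = (x for x in first)
    # re-wrap the chain so the value assigned to `hashes` is still a generator
    _hashes = itertools.chain(hashes, second)
    hashes = (x for x in _hashes)
    return hashes


def chain_with_annotation(first: Iterable[Any], second: Iterable[Any]) -> Iterator[Any]:
    # declare the wider type up front with a type comment (no py3.6 syntax needed)
    hashes = (x for x in first)  # type: Iterator[Any]
    # itertools.chain is an Iterator, so this reassignment matches the declared type
    hashes = itertools.chain(hashes, second)
    return hashes
```

Both functions yield the same items in the same order; the only runtime difference is the extra generator layer in the first one.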