-
-
Notifications
You must be signed in to change notification settings - Fork 18.4k
CLN: Simplify map_infer_mask #58483
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLN: Simplify map_infer_mask #58483
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks good generally
@@ -627,28 +627,25 @@ def _str_map( | |||
na_value = np.nan | |||
else: | |||
na_value = False | |||
try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're free, could you review the changes to the Arrow strings here?
I think what I have here is correct, but not too familiar with the arrow strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pandas/_libs/lib.pyx
Outdated
*, | ||
bint convert=True, | ||
object na_value=no_default, | ||
object dtype=np.dtype(object) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why this change? is the new numpy string dtype not an instance of cnp.dtype?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be an accidental change I forgot to roll back.
In general, is there a reason we prefer cnp.dtype
?
With cnp.dtype
I find it harder to do stuff like kind checks on the dtype object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, is there a reason we prefer cnp.dtype?
Just the usual "explicit is better than implicit" reasons; if it is object
here then in 6 months I'll have to check whether it can be an ExtensionDtype
pandas/_libs/lib.pyx
Outdated
bint convert=True, | ||
object na_value=no_default, | ||
object dtype=np.dtype(object) | ||
) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
returning None looks inaccurate. am i reading this wrong?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry i mixed up the annotations from the old _map_infer_mask
Thanks @lithomas1 |
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.This is needed for upcoming numpy string dtype support (getting rid of the
ndarray[object]
annotation).We might also get a decent speedup from using numpy C API functions (but probably not).
EDIT: Looks like benchmarks aren't significantly changed.