Skip to content

TYP: get_reverse_indexer, get_group_index_sorter #40476

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Mar 20, 2021

Conversation

jbrockmendel
Copy link
Member

nailing down all the "np.intp"s bit by little bit.

AFAICT we're not going to be able to annotate ndarrays with ndim/dtype any time soon. Might be worth looking into https://pypi.org/project/nptyping/

@simonjayhawkins simonjayhawkins added the Typing type annotations, mypy/pyright type checking label Mar 17, 2021
@jreback jreback added this to the 1.3 milestone Mar 17, 2021
@jreback jreback added the Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff label Mar 17, 2021
@jreback
Copy link
Contributor

jreback commented Mar 17, 2021

any perf changes?

@jbrockmendel
Copy link
Member Author

any perf changes?

might be some small gain on 32 bit, nothing on 64.

@@ -4197,17 +4206,19 @@ def _get_leaf_sorter(labels):
mask = new_lev_codes != -1
if not mask.all():
new_codes = [lab[mask] for lab in new_codes]
left_indexer = left_indexer[mask]
# error: Value of type "Optional[ndarray]" is not indexable
left_indexer = left_indexer[mask] # type: ignore[index]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

np.arange resolves to Any. https://github.com/numpy/numpy/blob/main/numpy/__init__.pyi

as a result left_indexer is not narrowed with left_indexer = np.arange(len(left), dtype=np.intp) (L4215 after merge master)

so I think in this case better to cast than add an ignore. (We have warn_redundant_casts = True in setup.cfg) so should be aware when the cast is no longer required)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated + green

)

# missing values are placed first; drop them!
left_indexer = left_indexer[counts[0] :]
# error: Value of type "Optional[ndarray]" is not indexable
left_indexer = left_indexer[counts[0] :] # type: ignore[index]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could maybe leave this one as a "fix later" ignore instead of a cast as we can fix ourselves by typing libalgos.groupsort_indexer (i.e. adding algos.pyi)

Copy link
Member

@simonjayhawkins simonjayhawkins left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jbrockmendel for the updates

@jbrockmendel
Copy link
Member Author

gentle ping; the intp_t stuff (this, #40475, #40474) is becoming blocker-ish for getting algos.pyi and groupby.pyi

@jreback
Copy link
Contributor

jreback commented Mar 19, 2021

yeah will look a bit later

@jreback jreback merged commit b519386 into pandas-dev:master Mar 20, 2021
@jbrockmendel jbrockmendel deleted the typ-get_reverse_indexer branch March 20, 2021 01:38
JulianWgs pushed a commit to JulianWgs/pandas that referenced this pull request Jul 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Algos Non-arithmetic algos: value_counts, factorize, sorting, isin, clip, shift, diff Typing type annotations, mypy/pyright type checking
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants