ENH: Hand numpy-like arrays with is_list_like #39830

Closed
wants to merge 30 commits into from

znicholls commented Feb 16, 2021

@jreback let's try this again. This time I am only updating is_list_like. This change allows the Pint EA to pass many more tests, as shown by the removal of @pytest.mark.xfail(run=True, reason="__iter__ / __len__ issue") in multiple places in https://github.com/hgrecco/pint-pandas/pull/59/files.

Fixes to is_scalar will also have to follow, but leaving those for a separate PR is probably simplest.

How can I tell how big the performance hit is?
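To make the "__iter__ / __len__ issue" concrete, here is a minimal sketch using only the standard library. FakeQuantity is a hypothetical stand-in for a numpy-like wrapper (it is not Pint's actual class), and both check functions are illustrative assumptions rather than the pandas implementation or the exact change in this PR.

```python
from collections import abc


class FakeQuantity:
    """Hypothetical numpy-like wrapper (not Pint's real class).

    It always defines __iter__ and __len__, even when it wraps a scalar.
    """

    def __init__(self, magnitude):
        self.magnitude = magnitude

    @property
    def ndim(self):
        # 0 for a wrapped scalar, 1 for a wrapped flat list.
        return 1 if isinstance(self.magnitude, list) else 0

    def __iter__(self):
        if self.ndim == 0:
            raise TypeError("iteration over a 0-d quantity")
        return iter(self.magnitude)

    def __len__(self):
        if self.ndim == 0:
            raise TypeError("len() of a 0-d quantity")
        return len(self.magnitude)


def is_list_like_iterable_only(obj):
    # Assumed simplification: anything that defines __iter__ counts,
    # except strings and bytes.
    return isinstance(obj, abc.Iterable) and not isinstance(obj, (str, bytes))


def is_list_like_ndim_aware(obj):
    # Illustrative variant: additionally reject objects reporting ndim == 0.
    return is_list_like_iterable_only(obj) and getattr(obj, "ndim", None) != 0


scalar_q = FakeQuantity(1.0)
print(is_list_like_iterable_only(scalar_q))  # misclassified as list-like
print(is_list_like_ndim_aware(scalar_q))     # rejected because ndim == 0
```

The first check reports the wrapped scalar as list-like because the class defines __iter__, even though iterating it fails. The second check treats it as a scalar while still accepting plain lists and 1-d wrappers.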

@jbrockmendel
Member

> How can I tell how big the performance hit is?

For something like this, we usually run %timeit is_list_like(foo) with a few different types of inputs and report the results on both master and the PR.
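Outside IPython, roughly the same measurement can be sketched with the standard-library timeit module. The check function below is only a placeholder for the function under test, and time_inputs is a hypothetical helper name. The script would be run once on master and once on the PR branch to compare the numbers.

```python
import timeit


def check(obj):
    # Placeholder for the function being benchmarked (e.g. is_list_like).
    return hasattr(obj, "__iter__") and not isinstance(obj, (str, bytes))


def time_inputs(func, inputs, number=10_000, repeat=5):
    """Return the best time per call, in nanoseconds, for each named input."""
    results = {}
    for name, obj in inputs.items():
        # Bind obj as a default argument so each timer uses its own input.
        # The lambda adds a small constant overhead to every measurement.
        timer = timeit.Timer(lambda obj=obj: func(obj))
        best = min(timer.repeat(repeat=repeat, number=number))
        results[name] = best / number * 1e9
    return results


inputs = {"list": [1, 2, 3], "tuple": (1, 2, 3), "scalar": 1.5, "str": "abc"}
for name, ns in time_inputs(check, inputs).items():
    print(f"{name}: {ns:.0f} ns per call")
```

Taking the minimum over several repeats reduces noise from other processes, which matters when the differences are on the order of tens of nanoseconds.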

@znicholls
Contributor Author

znicholls commented Feb 16, 2021

Ok

In [1]: import pandas as pd

In [2]: import numpy as np

In [3]: a1 = [1, 2, 3]
   ...: a2 = np.array([1, 2, 3])
   ...: a3 = np.array([1, 2, 3])[0]

In [4]: %timeit pd.api.types.is_list_like(a1)
412 ns ± 22 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)  # master
624 ns ± 15.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)  # pr

In [5]: %timeit pd.api.types.is_list_like(a2)
415 ns ± 20.4 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)  # master
416 ns ± 13.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)  # pr

In [6]: %timeit pd.api.types.is_list_like(a3)
485 ns ± 140 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)  # master
446 ns ± 120 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)  # pr

Almost identical results to #35127 (comment).

Using np.iterable gives a third option where things are faster for lists and arrays but slower for scalars (see #35127 (comment)).

If we end up back at this comment, #35127 (comment), then we are simply going in circles between that comment and this one, #35127 (comment).
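For reference, the np.iterable option mentioned above works by attempting iter() on the object and treating a TypeError as "not iterable". The sketch below reproduces that idea with the standard library only. The function names and the LegacySeq class are hypothetical, and the explanation of the scalar slowdown is an assumption rather than something measured in this thread.

```python
from collections import abc


def iterable_by_trying(obj):
    # Same idea as np.iterable: call iter() and treat TypeError as "no".
    try:
        iter(obj)
    except TypeError:
        return False
    return True


def iterable_by_isinstance(obj):
    # ABC-based check: only looks for an __iter__ method on the type.
    return isinstance(obj, abc.Iterable)


class LegacySeq:
    """Hypothetical class that is iterable only through __getitem__."""

    def __getitem__(self, index):
        if index >= 3:
            raise IndexError(index)
        return index


for value in ([1, 2, 3], 1.5, LegacySeq()):
    print(type(value).__name__, iterable_by_trying(value), iterable_by_isinstance(value))
```

The two checks disagree on LegacySeq, so switching between them is not purely a performance decision. For scalars, the try-based check has to raise and catch an exception, which is one plausible reason it is slower there.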

@jreback
Contributor

jreback commented Feb 17, 2021

FYI #39852 which changes the bogey. Not sure if the c-function is better or worse for you.

rhshadrach and others added 22 commits February 17, 2021 12:36
@znicholls
Contributor Author

> Not sure if the c-function is better or worse for you.

Thanks, that helps for sure (lists are slightly slower, but that might also be because I'm running lots of other things at the moment...). I think we can then include this without a performance hit after #39852.

znicholls changed the title from "Update is_list_like" to "ENH: Hand numpy-like arrays with is_list_like" on Feb 17, 2021
@github-actions
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.

github-actions bot added the Stale label on Mar 20, 2021
@simonjayhawkins
Member

@znicholls Thanks for the PR. Closing as stale. Ping if you want to continue.
