Skip to content

BUG: Fix regression in is_list_like #43373

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 6 commits into from
Sep 4, 2021

Conversation

aiudirog
Copy link
Contributor

@aiudirog aiudirog commented Sep 2, 2021

is_list_like() used to use isinstance(obj, abc.Iterable) before #39852 where it was optimized for performance. This resulted in a regression where objects that explicitly declare __iter__ as None are considered "list like" but not instances of abc.Iterable.

Because the original original PR was focused on performance, here are the same timeit cases run on this PR:

In[1]: import numpy as np
In[2]: from pandas._libs.lib import is_list_like
In[3]: arr = np.array(0)
In[4]: arr2 = np.array([0])

In[5]: %timeit is_list_like([])
51.3 ns ± 0.231 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)  <- master
56.3 ns ± 0.44 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)   <- PR
153 ns ± 2.4 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)     <- isinstance(obj, abc.Iterable)

In[6]: %timeit is_list_like(0)
117 ns ± 2.19 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)    <- master
126 ns ± 2 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)       <- PR
154 ns ± 1.18 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)    <- isinstance(obj, abc.Iterable)

In[7]: %timeit is_list_like(arr)
49.5 ns ± 0.156 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)  <- master
50.6 ns ± 0.14 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)   <- PR
152 ns ± 1.12 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)    <- isinstance(obj, abc.Iterable)

In[8]: %timeit is_list_like(arr2)
50.8 ns ± 1.36 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)   <- master
51.5 ns ± 0.179 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)  <- PR
151 ns ± 0.582 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)   <- isinstance(obj, abc.Iterable)
  • tests added / passed
  • Ensure all linting tests pass, see here for how to run them
  • whatsnew entry

is_list_like() used to use isinstance(obj, abc.Iterable) before pandas-dev#39852 where it was optimized for performance. This resulted in a regression where objects that explicitly declare __iter__ as None are considered "list like" but not instances of abc.Iterable.
@jreback jreback added this to the 1.4 milestone Sep 2, 2021
@jreback jreback added the Compat pandas objects compatability with Numpy or Python functions label Sep 2, 2021
@jreback
Copy link
Contributor

jreback commented Sep 2, 2021

no objection, ping on green (you can add a release not if you'd like), 1.4.

@aiudirog
Copy link
Contributor Author

aiudirog commented Sep 3, 2021

no objection, ping on green (you can add a release not if you'd like), 1.4.

Where do you think the best place to put the release note is? doc/sources/whatsnew/v1.4.0rst -> Bug Fixes -> Other?

@jbrockmendel
Copy link
Member

Bug Fixes -> Other?

That's fine, yes.

@phofl
Copy link
Member

phofl commented Sep 3, 2021

Shouldn't we backport is its a regression?

@jreback
Copy link
Contributor

jreback commented Sep 3, 2021

-0 on backporting. its no big deal to do, but i think this is a very minor case. @phofl do you feel strongly here?

@phofl
Copy link
Member

phofl commented Sep 3, 2021

No, just wanted to avoid that it gets missed

@jreback
Copy link
Contributor

jreback commented Sep 3, 2021

ok let's do it.

@aiudirog can you add a note to 1.3.3 regression section.

@jreback jreback modified the milestones: 1.4, 1.3.3 Sep 3, 2021
@aiudirog
Copy link
Contributor Author

aiudirog commented Sep 3, 2021

Will do, I'll take care of it later tonight.

@aiudirog
Copy link
Contributor Author

aiudirog commented Sep 4, 2021

Issue #42549 is related, but not the same as this issue: is_list_like also fails to detect __getitem__ implemented without __iter__. IMO solving that should require a different PR because it isn't a regression as isinstance(obj, abc.Iterable) doesn't detect those either, explicitly.

@jreback jreback merged commit baa1032 into pandas-dev:master Sep 4, 2021
@jreback
Copy link
Contributor

jreback commented Sep 4, 2021

thanks @aiudirog

@jreback
Copy link
Contributor

jreback commented Sep 4, 2021

@meeseeksdev backport 1.3.x

@lumberbot-app
Copy link

lumberbot-app bot commented Sep 4, 2021

Something went wrong ... Please have a look at my logs.

@aiudirog aiudirog deleted the fix-is-list-like branch September 5, 2021 01:44
simonjayhawkins pushed a commit that referenced this pull request Sep 5, 2021
feefladder pushed a commit to feefladder/pandas that referenced this pull request Sep 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Compat pandas objects compatability with Numpy or Python functions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants