Skip to content

Use take_1d in take_nd #40405

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

jorisvandenbossche
Copy link
Member

@jbrockmendel
Copy link
Member

jbrockmendel commented Mar 15, 2021

can remove comment in take_1d about being only for ArrayManager

Update: ill just do this in one of my cleanup branches

@jbrockmendel
Copy link
Member

test failure is unrelated (and the flaky test is fixed on master). LGTM

@jreback
Copy link
Contributor

jreback commented Mar 16, 2021

cool, does this allow refactor of other code?

@jbrockmendel
Copy link
Member

can this be moved out of Draft mode?

@jbrockmendel
Copy link
Member

@jorisvandenbossche gentle ping

@jbrockmendel
Copy link
Member

@jorisvandenbossche gentle ping to see if we can move this out of draft mode

@jbrockmendel
Copy link
Member

@jorisvandenbossche gentle ping

@jorisvandenbossche
Copy link
Member Author

Updated the PR.
But personally not sure if it is worth it, though:

In [1]: from pandas.core.array_algos.take import take_1d, take_nd

In [2]: dtype = 'int'
   ...: m = 100
   ...: n = 1000

In [3]: values = np.arange(m * n)

In [4]: index = np.arange(m)

# this is not affected, just including as reference
In [5]: %timeit take_1d(values, index)
5.67 µs ± 27 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)  <-- master
5.59 µs ± 50.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)  <-- PR

In [6]: %timeit take_nd(values, index)
10.5 µs ± 104 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)  <-- master
8.86 µs ± 68.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)  <-- PR

So it does give a small improvement, but when increasing the size of the array, the improvement almost completely disappears.

@jbrockmendel
Copy link
Member

3 lines for a 15% improvement for a fairly common case seems like a good deal to me

@@ -89,6 +89,10 @@ def take_nd(
if fill_value is lib.no_default:
fill_value = na_value_for_dtype(arr.dtype, compat=False)

if arr.ndim == 1 and axis == 0 and indexer is not None:
indexer = ensure_platform_int(indexer)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment to the effect of "fastpath"

@jreback
Copy link
Contributor

jreback commented Apr 22, 2021

agree this can't hurt (and you know me about not liking special cases)

@github-actions
Copy link
Contributor

This pull request is stale because it has been open for thirty days with no activity. Please update or respond to this comment if you're still interested in working on this.

@github-actions github-actions bot added the Stale label May 24, 2021
@simonjayhawkins
Copy link
Member

Thanks @jorisvandenbossche for the PR. closing as stale. re-open when ready.

@jorisvandenbossche jorisvandenbossche deleted the take_nd-dispatch-1d branch November 11, 2021 14:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Refactor Internal refactoring of code Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants