Skip to content

BUG make take_nd support readonly arrays #10070

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

ogrisel
Copy link
Contributor

@ogrisel ogrisel commented May 6, 2015

This is a tentative fix for #10043. I have not run any benchmark yet. I think we should do that before considering merging this PR. Any recommended way to do it?

@jreback jreback added the Indexing Related to indexing on series/frames, not to indexes themselves label May 6, 2015
@jreback
Copy link
Contributor

jreback commented May 6, 2015

@ogrisel perf looks fine. I ran a full benchmark and nothing out of line.

@jreback
Copy link
Contributor

jreback commented May 6, 2015

normally I wouldn't suggest this as its not a good idea to modify external state, but can this be done in a context manager? e.g.

with context:
    arr.flags.write=True
    result = take_.......(arr)
    arr.flags.write=False
    return result

@ogrisel
Copy link
Contributor Author

ogrisel commented May 6, 2015

@jreback I am not sure I understand. Do you mean for the tests or for the implementation?

For the implementation it would not work:

  • it's not thread safe
  • some arrays cannot be made writeable, for instance readonly np.memmap instances.

@jreback
Copy link
Contributor

jreback commented May 6, 2015

@ogrisel I know its not thread-safe, though to be fair most of pandas is not.
your 2nd point I didn't know.

@ogrisel
Copy link
Contributor Author

ogrisel commented May 6, 2015

BTW I confirm that this branch fixes the original issue reported in #10043.

I agree that the templating code is not very easy to follow though.

@ogrisel ogrisel changed the title BUG make take_nd support readonly array BUG make take_nd support readonly arrays May 6, 2015
@jreback jreback added this to the 0.16.1 milestone May 6, 2015
@jreback
Copy link
Contributor

jreback commented May 6, 2015

@ogrisel ok, your soln looks good. I just had to make a minor correction (the indexer was being passed as a ndarray into the memview function (I think maybe from the original code), and was showing some cython warnings. Making it a memory view worked as well. And added some more tests.

As soon as this passes will merge. We are going to release 0.16.1 soon, so this will make it in.

https://github.com/jreback/pandas/tree/ogrisel-readonly-take_nd

thanks!

FYI I tried to change this to general 2 separate cython function and disambiguate at a higher level, but then I had to fix more stuff....so this is a nice work-around.

@ogrisel
Copy link
Contributor Author

ogrisel commented May 7, 2015

Great! But before merging you might want to consider @richardhansen's comment at #10043 (comment). We should check whether this alternative solution can work on read-only np.memmap instances.

FYI I tried to change this to general 2 separate cython function and disambiguate at a higher level, but then I had to fix more stuff....so this is a nice work-around.

I tried that as well initially but rolled back my changes why I saw the amount of code to edit on the Python side.

@ogrisel
Copy link
Contributor Author

ogrisel commented May 7, 2015

Actually I find @richardhansen's solution not necessarily simpler to understand than my templating hack. Let me know what you think.

@jreback
Copy link
Contributor

jreback commented May 7, 2015

@ogrisel so options are to defer this to next version to investigate
cc @richardhansen comment a bit more, or to merge this soln (which works, is performant, adds a bit to code complexity though)

@artemyk
Copy link
Contributor

artemyk commented May 7, 2015

One thing to consider --- if the original memview indexing code was put in to increase indexing performance, @richardhandsen's solution (which, if I understand correctly, would permit using memviews in all cases) might also offer better indexing performance on read-only arrays .

@jreback
Copy link
Contributor

jreback commented May 7, 2015

@artemyk

do you want to experiment with @richardhandsen soln?

It might be simpler (e.g. less code-generation). I don't think perf is really a big deal on read-only arrays. They are not found too much in the wild (as evidenced that this issue is just coming up but this memview changes was made several years ago).

@artemyk
Copy link
Contributor

artemyk commented May 7, 2015

@jreback Probably won't have time in the next few days, though next week I can try to take a look. Would not hold back merging into 0.16.1 though (unless someone wants to play w. it sooner)

@jreback
Copy link
Contributor

jreback commented May 7, 2015

@artemyk ok gr8. Will merge this, can always revert if a better soln is found.

@jreback
Copy link
Contributor

jreback commented May 8, 2015

merged via be2a9f8

thanks!

@jreback jreback closed this May 8, 2015
@amueller
Copy link

amueller commented May 8, 2015

sweet!

@ogrisel
Copy link
Contributor Author

ogrisel commented May 11, 2015

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Indexing Related to indexing on series/frames, not to indexes themselves
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants