Skip to content

Handle CoW in BlockManager.apply #50948

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

jorisvandenbossche
Copy link
Member

@phofl seeing that we need the Block methods be aware of CoW (does it make a copy or not? do we need to keep track of a ref or not?) in several cases, I was thinking of this alternative of how to handle this generally. I don't know if it will turn out to be cleaner if has to handle all different cases, but for now I tested it with astype and infer_objects (#50802, #50428).

The idea is that the Block method just tells the manager whether the block that is returned is a view or a copy, and then all the weakref management is centralized in the BlockManager.apply() method.
It avoids the "hack" of attaching the _ref to the block. Now I am writing this, of course also the in case of using the _ref method, the extracting of those refs could be centralized in apply() instead of done in both astype and convert, as it is the case right now in the linked PRs.
I think I find the logic a bit simpler here in this PR, especially for convert with its block-splitting logic, but it's also not a big difference (and I am of course biased since I wrote this one, and the logic in your PRs I only read ;))

Note, I just copied some minimal pieces from your PRs to test this, and didn't bother with typing and cleaning it up etc. If we think this might be a suitable path forward, I would also remove those parts here again, and only add the general infrastructure here, and keep astype/convert for its separate PRs.

@phofl
Copy link
Member

phofl commented Jan 24, 2023

I agree that we have to centralise some things (which is something that is on my TODO list as well). I thought about this solution as well, but decided to go with the specialised things on the block level to be a bit more flexible for now. I'd prefer having a go at centralizing this in apply after implementing CoW for most methods that need this on the block level. Right now, it's still not totally clear to me if this is a good idea, replace for example is pretty convoluted and could be a hassle when passing this back. On the other side you are totally correct that the split block thing is annoying as hell here, what we could do though is handle this within the decorator?

But we should definitely add a central method on the manager level to unpack _ref and a central method to populate it on the block level.

@jorisvandenbossche
Copy link
Member Author

With #51144, this is now much cleaner and this can be closed.

@jorisvandenbossche jorisvandenbossche deleted the cow-manager-apply branch February 8, 2023 16:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants