Prefer construction via DLPack to costly element-by-element copy #120615
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/120615

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures) As of commit 3b48f6a with merge base b381a43:
- FLAKY - The following job failed but was likely due to flakiness present on trunk.
- BROKEN TRUNK - The following job failed but was present on the merge base. 👉 Rebase onto the `viable/strict` branch to avoid these failures.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

The committers listed above are authorized under a signed CLA.
@vadimkantorov
I'm not one of the maintainers/reviewers, so we'll have to see what they have to say :) I just also stumbled on this similar issue of …
Interestingly, there is some support for …
For reference, who is implementing …
This would be great to fix, thanks @wjakob.
+1 to adding this to …
In what is probably the most common case (no explicit …).

Another important difference is dtype handling. The dtypes supported by CAI are numpy type strings, while DLPack uses these enums. Where there are differences, DLPack dtypes are more useful to PyTorch, I'd think. The extra numpy ones (strings, datetimes, …).

DLPack is also more general w.r.t. other devices, and is being actively evolved. E.g., it just gained a read-only flag, which may become relevant if PyTorch gains read-only/copy-on-write tensors, for example. So I'd have a preference for trying DLPack before CAI.

The counter-argument I can think of is stability. DLPack just tagged 1.0rc1. 1.0 is ABI-breaking; there's a way to query the version and handle both 0.x and 1.0 appropriately, but that should then of course be supported in PyTorch within 12-18 months or so after it comes out, otherwise there may be a situation in the future where other libraries drop 0.x support while PyTorch doesn't yet support 1.x.
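The 0.x/1.0 negotiation mentioned above can be sketched from the consumer side. This is an illustrative sketch, not PyTorch's actual code: a DLPack 1.x producer accepts a `max_version` keyword in `__dlpack__`, while a 0.x producer raises `TypeError` for the unknown argument, letting the consumer fall back to the legacy call. The producer classes below are stand-ins, not real libraries.

```python
def request_capsule(producer, max_version=(1, 0)):
    """Ask for a DLPack capsule, negotiating the protocol version."""
    try:
        # DLPack >= 1.0 path: the producer may downgrade to any
        # version it supports that is <= max_version.
        return producer.__dlpack__(max_version=max_version)
    except TypeError:
        # Legacy 0.x producer: no version negotiation available.
        return producer.__dlpack__()


class LegacyProducer:
    """Stand-in for a library still on DLPack 0.x."""
    def __dlpack__(self, stream=None):
        return "capsule-0.x"  # a real producer returns a PyCapsule


class ModernProducer:
    """Stand-in for a library on DLPack 1.0."""
    def __dlpack__(self, stream=None, max_version=None):
        return ("capsule-1.0", max_version)
```

With this shape, one dispatch function handles producers on either side of the ABI break.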
@ezyang, @rgommers: Regarding the flaws of …
(Yikes.) DLPack basically clarifies the ownership story and makes that part accessible to non-Python consumers/producers that don't have a way of …
Ah yes, lifetime management is a worse issue than the ones I mentioned. Then I'd definitely put DLPack first.
Also, DLPack support in PyTorch may still have this bug: #43166

The variant of supporting full user-in-charge lifetime management (or passing an extra user-provided deleter) is also important for basic interop scenarios like #34646.

Also, if conflicts between several interfaces might arise, is it possible to somehow select the wanted protocol? e.g. by …
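The conflict question above boils down to a precedence order among interchange protocols. Here is a hedged sketch of such an order; the dunder attributes checked are the real protocol names, but the dispatch function itself is hypothetical and not PyTorch's actual implementation:

```python
def pick_protocol(obj):
    """Illustrative precedence when several interchange protocols coexist."""
    if hasattr(obj, "__dlpack__") and hasattr(obj, "__dlpack_device__"):
        return "dlpack"                # device-aware, zero-copy, clear ownership
    if hasattr(obj, "__cuda_array_interface__"):
        return "cuda_array_interface"  # CAI (CUDA devices only)
    if hasattr(obj, "__array__"):
        return "numpy"                 # via an intermediate ndarray
    if hasattr(obj, "__len__") and hasattr(obj, "__getitem__"):
        return "sequence"              # slowest: element-by-element copy
    raise TypeError("no supported interchange protocol")


class DLPackArray:
    """Stand-in producer exposing only the DLPack protocol."""
    def __dlpack__(self, stream=None): ...
    def __dlpack_device__(self): ...


class SequenceOnlyArray:
    """Stand-in producer exposing only the sequence protocol."""
    def __len__(self): return 0
    def __getitem__(self, i): raise IndexError
```

An explicit user override (e.g. a keyword argument naming the wanted protocol) could then simply skip ahead in this chain.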
@vadimkantorov Actually, I think I had reported the same issue separately (I was unaware of this ticket) as #117273, which has been fixed by … in the meantime.
Needs test
just waiting for test
Looks like this PR hasn't been updated in a while, so we're going to go ahead and mark this as …
@wjakob Are you still working on this? If not, is it ok if I take over this PR?
@ysiraichi please go ahead, I ran out of time to finish this.
(#138697) Fixes #120614. Takes over #120615.

In summary, this PR:
- Adds a `__dlpack__` attribute check in the tensor creation path (i.e. [`internal_new_from_data` @ tensor_new.cpp](https://github.com/pytorch/pytorch/blob/cdfe1bffd16bdd28adbe5518038f68e6ac45de8d/torch/csrc/utils/tensor_new.cpp#L266))
- Creates the tensor by using the DLPack machinery, instead of an element-by-element copy
  - No changes since #120615
- Adds a test, making sure the DLPack machinery is used
  - Wraps a tensor in a fresh `TensorDLPackWrapper` class that implements only the DLPack methods
  - Creates a new tensor from an instance of `TensorDLPackWrapper`

Pull Request resolved: #138697
Approved by: https://github.com/ezyang
Co-authored-by: Wenzel Jakob <[email protected]>
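The test strategy in the summary can be sketched in pure Python. The `TensorDLPackWrapper` below is a guess at the shape of the class named in the summary, not the merged implementation, and `FakeArray` is a hypothetical stand-in producer:

```python
class TensorDLPackWrapper:
    """Wraps an array-like object and exposes *only* the DLPack protocol.

    If tensor construction from this wrapper succeeds, the construction
    path must have gone through DLPack, since no sequence protocol
    (__len__/__getitem__) is available to fall back on.
    """

    def __init__(self, wrapped):
        self._wrapped = wrapped

    def __dlpack__(self, **kwargs):
        return self._wrapped.__dlpack__(**kwargs)

    def __dlpack_device__(self):
        return self._wrapped.__dlpack_device__()


class FakeArray:
    """Hypothetical producer used only to exercise the wrapper."""

    def __dlpack__(self, **kwargs):
        return "capsule"  # a real producer returns a PyCapsule

    def __dlpack_device__(self):
        return (1, 0)  # (kDLCPU, device 0) in DLPack's device encoding
```

Because the wrapper deliberately omits `__len__` and `__getitem__`, a slow element-wise fallback would fail loudly instead of silently succeeding.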
Fixes #120614
Copy of pitch:
Consider the following statement, where `other_array` is an array instance constructed by another array programming framework. If this other framework is principally designed around the DLPack array exchange protocol, something very bad happens.

Basically, PyTorch ignores the DLPack interface (`other_array.__dlpack__`) altogether. Furthermore, if `other_array` exposes the sequence protocol (`__len__` and `__getitem__`), PyTorch will perform a brute-force element-wise copy, potentially requiring hundreds of millions of separate PCI Express transactions to copy individual floating point values from the GPU. To the user, it will seem that the application has crashed, because nothing happens and attempting to interrupt the Python kernel doesn't work since this is all done on the C++ side.

Fortunately, PyTorch supports the DLPack protocol to efficiently exchange tensors with other libraries. But it only does so when the user creates the arrays in a sort of awkward way:

It's easy to forget to do this, with extremely unpleasant results. It would be very easy for PyTorch to check if `other_array` implements the DLPack protocol and then simply switch to this construction method. I will create a separate PR proposing a prototype of this idea.
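The proposed fast path fits in a few lines. The sketch below uses hypothetical stand-ins (`from_dlpack`, `slow_elementwise_copy`) for PyTorch's real machinery (`torch.from_dlpack` and the C++ sequence-copy path), purely to illustrate the dispatch being proposed:

```python
def from_dlpack(obj):
    # Hypothetical stand-in for torch.from_dlpack: zero-copy import
    # via the capsule returned by the producer.
    return ("dlpack", obj.__dlpack__())


def slow_elementwise_copy(obj):
    # Hypothetical stand-in for the current sequence fallback:
    # one transfer per element (the pathological case described above).
    return ("copy", [obj[i] for i in range(len(obj))])


def tensor_from(obj):
    """Sketch of the proposed construction path: try DLPack first."""
    if hasattr(obj, "__dlpack__"):
        return from_dlpack(obj)
    return slow_elementwise_copy(obj)


class DLPackArray:
    """Minimal producer exposing only __dlpack__."""

    def __dlpack__(self):
        return "capsule"
```

With this check in place, `tensor_from(other_array)` would take the zero-copy route whenever the producer supports DLPack, and only plain sequences would hit the element-wise copy.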