Skip to content

The spec of to_device() method is missing #256

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
leofang opened this issue Sep 8, 2021 · 12 comments · Fixed by #259
Closed

The spec of to_device() method is missing #256

leofang opened this issue Sep 8, 2021 · 12 comments · Fixed by #259
Labels
bug Something isn't working. topic: Device Handling Device handling.
Milestone

Comments

@leofang
Copy link
Contributor

leofang commented Sep 8, 2021

Here we mentioned there's a to_device() method to enable data copy/movement across devices:

3. A `.to_device(device)` method on the array object, with `device` again being
a `Device` object, to move an array to a different device.

But we haven't defined the signature and semantics yet.

@leofang
Copy link
Contributor Author

leofang commented Sep 8, 2021

I think the signature should be as follows:

to_device(self, /, device, *, stream=None)

where device is a mandatory positional argument and stream is an optional kw-only argument, with the same semantics as in the DLPack support.

@rgommers rgommers added the bug Something isn't working. label Sep 9, 2021
@rgommers
Copy link
Member

rgommers commented Sep 9, 2021

Good catch, that is indeed missing.

where device is a mandatory positional argument

So you don't want it to be positional-only? That's possible, because it's a new method name so every library can choose the same name. However x.to_device(device='xxx') may be a little odd with the double "device".

@leofang
Copy link
Contributor Author

leofang commented Sep 13, 2021

Hi @rgommers

So you don't want it to be positional-only?

Ah yes, I meant it to be position-only, but I typed the signature wrong. It should read

to_device(self, device, /, *, stream=None)

@leofang
Copy link
Contributor Author

leofang commented Sep 13, 2021

My apology, @rgommers, I forgot that you already had a pending PR for this (#171). Should we merge it first, and then I'll make improvements (add the stream argument, etc) in #259?

@kgryte
Copy link
Contributor

kgryte commented Sep 13, 2021

@leofang Yeah, that would probably make the most sense. Based on #171, looks like it still may still need a couple of tweaks before merging.

@leofang
Copy link
Contributor Author

leofang commented Sep 14, 2021

Thanks, Athan, I just did!

@rgommers
Copy link
Member

Actually, let me ask a question about stream here: is using the DLPack-like integer the right choice here? I think libraries will either have no support for streams, or have custom objects. For DLPack there was little choice, because a custom class instance cannot work. However intra-library it could, similar to how Device itself works, or the dtype objects. Seems like that would be more natural.

@leofang
Copy link
Contributor Author

leofang commented Sep 15, 2021

That's a good point, Ralf. Thanks for brining it up. How about we make it a union, i.e. we accept both the DLPack-like integer and library-specific stream objects?

@rgommers
Copy link
Member

That makes sense to me. The "integer to stream" logic must be implemented anyway to support DLPack, so is not much of an extra burden. The actual union then becomes Union[int, _Stream] with _Stream: Any I guess?

It would also be good to write up here in a few sentences how often stream support in to_device is needed. I see PyTorch doesn't have it in .to (docs)- it does have a non_blocking keyword though, which I assume interacts with a stream context manager.

@BvB93
Copy link
Contributor

BvB93 commented Sep 16, 2021

That makes sense to me. The "integer to stream" logic must be implemented anyway to support DLPack, so is not much of an extra burden. The actual union then becomes Union[int, _Stream] with _Stream: Any I guess?

Technically a Union is not the correct type here, as (with protocol arguments being contravariant) that would imply concrete implementation must, at minimum, have support for all union members. Instead, with int- and _Stream-support being optional, you're in a similar situation to builtin protocols such as __abs__, wherein one of the types is a (somewhat constrained) free variable. Expressing the latter can be done with a typevar (e.g. ~T or ~StreamType), though arguably they don't look quite as nice as unions.

Examples

from __future__ import annotation
from typing import Union, Protocol, Any, TypeVar

_Stream = Any
_T_contra = TypeVar("_T_contra", bound=Union[None, int, _Stream], contravariant=True)

# Concrete implementations must support all union-members
class SupportsToDevice1(Protocol):
    to_device(self, /, device, *, stream: int | None | _Stream = None): ...

# Concrete implementations must support one or more of the typevar bounds
class SupportsToDevice2(Protocol[_T_contra]):
    to_device(self, /, device, *, stream: None | _T_contra = None): ...

@leofang
Copy link
Contributor Author

leofang commented Sep 16, 2021

It would also be good to write up here in a few sentences how often stream support in to_device is needed. I see PyTorch doesn't have it in .to (docs)- it does have a non_blocking keyword though, which I assume interacts with a stream context manager.

Yes I think so too, so PyTorch's .to(..., nonblocking, ...) is not much different from CuPy's asnumpy(..., stream, ...). They're all about synchronous/asynchronous executions, which is a subject we don't want to touch in the Array API as far as I understand.

So what I have in mind is taking one step back. We can say "if stream is specified, the copy is done on the given stream, otherwise it's done on the library's default stream" (whatever it is, even CPU libraries have an implicit "queue/stream"). Whether or not asynchronous execution is supported is up to the library.

@kgryte kgryte added this to the v2021 milestone Oct 4, 2021
@rgommers
Copy link
Member

Expressing the latter can be done with a typevar (e.g. ~T or ~StreamType), though arguably they don't look quite as nice as unions.

This makes perfect sense. It doesn't belong in the API standard docs though I think, that'd be pretty confusing to the reader. Maybe the right approach is to avoid using Union and just use text to say "or any ...".

For a library implementing this, the protocol approach makes sense. Right now we don't have a good place to put these type annotations though.

We can say "if stream is specified, the copy is done on the given stream, otherwise it's done on the library's default stream" (whatever it is, even CPU libraries have an implicit "queue/stream"). Whether or not asynchronous execution is supported is up to the library.

This sounds good.

@rgommers rgommers added the topic: Device Handling Device handling. label May 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working. topic: Device Handling Device handling.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants