|
2 | 2 |
|
3 | 3 | # Device support
|
4 | 4 |
|
5 |
| -TODO. See https://github.com/data-apis/array-api/issues/39 |
| 5 | +For libraries that support execution on more than a single hardware device - e.g. CPU and GPU, or multiple GPUs - it is important to be able to control on which device newly created arrays get placed and where execution happens. Attempting to be fully implicit doesn't scale to situations with multiple GPUs. |
6 | 6 |
|
| 7 | +Existing libraries employ one or more of these three methods to exert such control: |
| 8 | +1. A global default device, which may be fixed or user-switchable. |
| 9 | +2. A context manager to control device assignment within its scope. |
| 10 | +3. Local control via explicit keywords and a method to transfer arrays to another device. |
| 11 | + |
| 12 | +This standard chooses to add support for method 3 (local control), because it's the most explicit and granular, with its only downside being verbosity. A context manager may be added in the future - see {ref}`device-out-of-scope` for details. |
| 13 | + |
| 14 | +## Syntax for device assignment |
| 15 | + |
| 16 | +The array API will offer the following syntax for device assignment and cross-device data transfer: |
| 17 | + |
| 18 | +1. A string representation to identify a device: `'device_type:index'`, with |
| 19 | + `:index` optional (e.g. it doesn't apply to `'cpu'`). All lower-case, with type |
| 20 | + strings `'cpu'`, `'gpu'`, `'tpu'`. |
| 21 | +2. A `device` object, whose constructor takes the string representation, with properties: |
| 22 | + - `str`: the string representation. |
| 23 | + - `type`: the device type part of the string representation. |
| 24 | + - `index`: the device index, as an integer (with the first device of a given type having index `0`). |
| 25 | +3. A `device=None` keyword for array creation functions, which takes either the string representation or an instance of a `device` object. |
| 26 | +4. A `.to(device)` method on the array object, with `device` again being |
| 27 | + either a string or a device instance, to move arrays to a different device. |
| 28 | +5. A `.device` property on the array object, which returns a `device` object instance. |
| 29 | + |
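The string representation and the `device` object described above can be illustrated with a minimal sketch. This is not a reference implementation: the class name `Device`, the `KNOWN_TYPES` tuple, and the choice of `None` as the index when `:index` is omitted are assumptions made for illustration only.

```python
from typing import Optional

# Assumption: the device type strings listed in the syntax section.
KNOWN_TYPES = ("cpu", "gpu", "tpu")


class Device:
    """Hypothetical device object built from a 'device_type:index' string."""

    def __init__(self, spec: str) -> None:
        dev_type, sep, index = spec.partition(":")
        if dev_type not in KNOWN_TYPES:
            raise ValueError(f"unknown device type in {spec!r}")
        if sep and not index.isdecimal():
            raise ValueError(f"invalid device index in {spec!r}")
        self._type = dev_type
        # Assumption: a spec without ':index' (e.g. 'cpu') has no index.
        self._index: Optional[int] = int(index) if sep else None

    @property
    def type(self):
        """The device type part of the string representation."""
        return self._type

    @property
    def index(self):
        """The device index as an integer, or None if not given."""
        return self._index

    # Defined last so the name `str` does not shadow the builtin
    # anywhere earlier in the class body.
    @property
    def str(self):
        """The full string representation."""
        if self._index is None:
            return self._type
        return f"{self._type}:{self._index}"


gpu1 = Device("gpu:1")
print(gpu1.type, gpu1.index, gpu1.str)
```

A library would pass such an object, or the plain string, to the `device=` keyword of its array creation functions and to `.to(device)`.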
| 30 | + |
| 31 | +## Semantics |
| 32 | + |
| 33 | +- Operations involving one or more arrays on the same device must return arrays on that same device. |
| 34 | +- Operations involving arrays on different devices must raise an exception. |
| 35 | +- `device` object instances are only meant to be consumed by the library that produced them - the string attribute can be used for portability between libraries. |
| 36 | +- If a library encounters a device specification for an unknown or |
| 37 | + unsupported device, it must raise a `ValueError`. |
| 38 | +- There must be a default device, meaning all usages of `device=None` will produce arrays on the same device. |
| 39 | + |
| 40 | +```{note} |
| 41 | +The default device will typically be either `'cpu'` or `'gpu:0'`; however, this |
| 42 | +is _not_ a requirement. Also note that the default device can vary based on |
| 43 | +available devices, e.g. `'gpu:0'` if available, `'cpu'` otherwise. Users should be |
| 44 | +aware of this, and consider using explicit device control if the default may |
| 45 | +not be right for them. |
| 46 | +``` |
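The semantics above can be sketched with a toy array type. Everything here is hypothetical: the `Array` class, the `DEFAULT_DEVICE` and `SUPPORTED_DEVICES` values, and the use of plain strings in place of `device` objects are simplifications. The standard only requires that mixing devices raises an exception; `ValueError` for that case is an assumption of this sketch.

```python
# Assumptions: this toy library defaults to the CPU and pretends that
# two GPUs are available. No real device transfer happens.
DEFAULT_DEVICE = "cpu"
SUPPORTED_DEVICES = {"cpu", "gpu:0", "gpu:1"}


class Array:
    """Toy array that only tracks its data and a device string."""

    def __init__(self, data, device=None):
        # device=None always resolves to the same default device.
        device = DEFAULT_DEVICE if device is None else device
        if device not in SUPPORTED_DEVICES:
            raise ValueError(f"unknown or unsupported device: {device!r}")
        self.data = list(data)
        self._device = device

    @property
    def device(self):
        return self._device

    def to(self, device):
        """Return a copy of this array placed on `device`."""
        return Array(self.data, device=device)

    def __add__(self, other):
        # Arrays on different devices must not be combined implicitly.
        if self.device != other.device:
            raise ValueError(
                f"arrays are on different devices: "
                f"{self.device!r} and {other.device!r}"
            )
        # The result lives on the same device as the inputs.
        summed = [a + b for a, b in zip(self.data, other.data)]
        return Array(summed, device=self.device)


x = Array([1, 2])                  # placed on the default device
y = Array([3, 4], device="gpu:0")  # explicit placement
z = x.to("gpu:0") + y              # move x first, then add
print(z.device, z.data)
```

Calling `x + y` directly, without the `.to("gpu:0")` transfer, raises in this sketch, which is the explicit behaviour the semantics ask for.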
| 47 | + |
| 48 | + |
| 49 | +(device-out-of-scope)= |
| 50 | + |
| 51 | +## Out of scope for device support |
| 52 | + |
| 53 | +Individual libraries may offer APIs for one or more of the following topics; |
| 54 | +however, those are out of scope for this standard: |
| 55 | + |
| 56 | +- Setting a default device globally |
| 57 | +- Stream/queue control |
| 58 | +- Distributed allocation |
| 59 | +- Memory pinning |
| 60 | +- A context manager for device control |
| 61 | + |
| 62 | +```{note} |
| 63 | +A context manager for controlling the default device is present in most existing array |
| 64 | +libraries (NumPy being the exception). There are, however, concerns with using a |
| 65 | +context manager: |
| 66 | +
|
| 67 | +- TensorFlow has an issue where its `.shape` attribute is also a tensor, and |
| 68 | + that interacts badly with its context manager approach to specifying |
| 69 | + devices - because metadata like shape typically should live on the host, |
| 70 | + not on an accelerator. |
| 71 | +- A context manager can be tricky to use at a high level, since it may affect |
| 72 | + library code below function calls (non-local effects). See, e.g., [this |
| 73 | + PyTorch issue](https://github.com/pytorch/pytorch/issues/27878) for a |
| 74 | + discussion on a good context manager API. |
| 75 | +
|
| 76 | +Adding a context manager may be considered in a future version of this API standard. |
| 77 | +``` |