Commit 2784564

Add section on copy-view behaviour and mutability
Closes gh-24
1 parent 4ffb46c commit 2784564

3 files changed

+74
-2
lines changed

@@ -0,0 +1,73 @@
1+
.. _copyview-mutability:
2+
3+
# Copy-view behaviour and mutability
4+
5+
Strided array implementations (e.g. NumPy, PyTorch, CuPy, MXNet) typically
6+
have the concept of a "view", meaning an array containing data in memory that
7+
belongs to another array (i.e. a different "view" on the original data).
8+
Views are useful for performance reasons - not copying data to a new location
9+
saves memory and is faster than copying - but can also affect the semantics
10+
of code. This happens when views are combined with _mutating_ operations.
11+
This simple example illustrates that:
12+
13+
```python
# `ones` comes from whichever array library is in use
x = ones(1)
x += 2
y = x[:]  # `y` *may* be a view on the data of `x`
y -= 1  # if `y` is a view, this modifies `x`
```
19+
20+
Code as simple as the above example will not be portable between array
21+
libraries - for NumPy/PyTorch/CuPy/MXNet `x` will contain the value `2`,
22+
while for TensorFlow/JAX/Dask it will contain the value `3`. The combination
23+
of views and mutability is fundamentally problematic here if the goal is to
24+
be able to write code with unambiguous semantics.
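As a concrete illustration, the sketch below runs the example with NumPy, picked here only as one representative strided library, and contrasts it with an explicit copy. The variable names `x_after_view` and `x_after_copy` are introduced purely for the illustration.

```python
import numpy as np

# View case: basic slicing in NumPy returns a view on the same memory.
x = np.ones(1)
x += 2                    # x is now [3.]
y = x[:]                  # `y` shares memory with `x`
y -= 1                    # the in-place update is visible through `x`
x_after_view = x.copy()   # snapshot: [2.]

# Copy case: an explicit copy owns its own memory.
x = np.ones(1)
x += 2
z = x.copy()
z -= 1                    # only `z` changes
x_after_copy = x.copy()   # snapshot: [3.]

print(x_after_view, x_after_copy)
```

A library that returns a copy from `x[:]`, or that has immutable arrays, behaves like the second case.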
25+
26+
Views are necessary for getting good performance out of the current strided
27+
array libraries. It is not always clear however when a library will return a
28+
view, and when it will return a copy. This API standard does not attempt to
29+
specify this - libraries can do either.
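To see how hard it can be to predict view-versus-copy behaviour, the following sketch uses NumPy's `np.shares_memory` to inspect a few common operations. The results are specific to NumPy; other libraries may behave differently for the same expressions.

```python
import numpy as np

x = np.arange(6)

candidates = {
    "basic slice": x[::2],                          # basic slicing: view
    "integer-array index": x[[0, 2, 4]],            # advanced indexing: copy
    "reshape (contiguous)": x.reshape(2, 3),        # no data movement needed: view
    "reshape (after transpose)": x.reshape(2, 3).T.reshape(6),  # needs a copy
}

# Map each operation to whether its result shares memory with `x`.
shares = {name: bool(np.shares_memory(x, arr)) for name, arr in candidates.items()}

for name, is_view in shares.items():
    print(f"{name}: {'view' if is_view else 'copy'}")
```

Note that `reshape` appears in both groups: whether it returns a view depends on the memory layout of its input, not only on the function being called.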
30+
31+
There are several types of operations that do in-place mutation of data
32+
contained in arrays. These include:
33+
34+
1. Inplace operators (e.g. `*=`)
35+
2. Item assignment (e.g. `x[0] = 1`)
36+
3. Slice assignment (e.g., `x[:2, :] = 3`)
37+
4. The `out=` keyword present in some strided array libraries (e.g. `sin(x, out=y)`)
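The four kinds of mutation listed above can be sketched with NumPy as follows. This is only an illustration in one library that happens to support all four; the array shapes and values are arbitrary.

```python
import numpy as np

x = np.ones((3, 2))

x *= 2              # 1. in-place operator
x[0, 0] = 1         # 2. item assignment
x[:2, :] = 3        # 3. slice assignment (overwrites the first two rows)

y = np.empty_like(x)
np.sin(x, out=y)    # 4. `out=` keyword: the result is written into `y`

print(x)
print(y)
```

Each of these statements changes existing memory rather than producing a new array, so each one is observable through any view on `x` or `y`.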
38+
39+
Libraries like TensorFlow and JAX tend to support inplace operators, provide
40+
alternative syntax for item and slice assignment (e.g. an `update_index`
41+
function or `x.at[idx].set(y)`), and have no need for `out=`.
42+
43+
A potential solution could be to make views read-only, or use copy-on-write
44+
semantics. Both are hard to implement and would present significant issues
45+
for backwards compatibility for current strided array libraries. Read-only
46+
views would also not be a full solution, given that mutating the original
47+
(base) array will also result in ambiguous semantics. Hence this API standard
48+
does not attempt to go down this route.
49+
50+
Both inplace operators and item/slice assignment can be mapped onto
51+
equivalent functional expressions (e.g. `x[idx] = val` maps to
52+
`x.at[idx].set(val)`), and given that both inplace operators and item/slice
53+
assignment are very widely used in both library and end user code, this
54+
standard chooses to include them.
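The mapping onto functional expressions can be sketched as below. `set_at` is a hypothetical helper written for this illustration, not part of any library or of this standard; it mimics what an expression like `x.at[idx].set(val)` does by returning a new array and leaving its input untouched.

```python
import numpy as np

def set_at(x, idx, val):
    """Hypothetical out-of-place equivalent of `x[idx] = val`."""
    out = x.copy()      # never touch the caller's data
    out[idx] = val
    return out

x = np.zeros(4)

y = set_at(x, slice(0, 2), 5.0)   # functional form of `x[:2] = 5.0`
w = y * 2                         # functional form of `y *= 2`

print(x, y, w)
```

A library with immutable arrays can implement the in-place syntax in terms of such expressions, at the cost of the extra copy unless it can prove the old value is no longer needed.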
55+
56+
The situation with `out=` is slightly different - it's less heavily used, and
57+
easier to avoid. It's also not an optimal API, because it mixes an
58+
"efficiency of implementation" consideration ("you're allowed to do this
59+
inplace") with the semantics of a function ("the output _must_ be placed into
60+
this array"). There are alternatives, for example the donated arguments in JAX
61+
or working buffers in LAPACK, that allow the user to express "you _may_
62+
overwrite this data, do whatever is fastest". Given that those alternatives
63+
aren't widely used in array libraries today, this API standard chooses to (a)
64+
leave out `out=`, and (b) not specify another method of reusing arrays that
65+
are no longer needed as buffers.
66+
67+
This leaves the problem of the initial example - with this API standard it
68+
remains possible to write code that will not work the same for all array
69+
libraries. This is something that the user must be careful about.
70+
71+
.. note::
72+
73+
It is recommended that users avoid any mutating operations when a view may be involved.
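One way to follow this recommendation is to write functions that return new arrays instead of updating their arguments. The sketch below uses NumPy, and the function names `center` and `center_inplace` are made up for the illustration.

```python
import numpy as np

def center(a):
    """Out-of-place: returns a new array, the input is left untouched."""
    return a - a.mean()

def center_inplace(a):
    """In-place: also changes every array sharing memory with `a`."""
    a -= a.mean()
    return a

data = np.array([1.0, 2.0, 3.0, 4.0])
head = data[:2]                  # a view on `data` in NumPy

centered = center(head)          # safe: `data` is unchanged
data_before = data.copy()

center_inplace(head)             # also rewrites the first two items of `data`
data_after = data.copy()

print(centered, data_before, data_after)
```

With the out-of-place version, the result is the same whether `head` is a view or a copy, which is what makes the code portable across libraries.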

spec/design_topics/eager_lazy_eval.md

-1
This file was deleted.

spec/design_topics/index.rst

+1-1
@@ -6,7 +6,7 @@ Design topics & constraints
66
:maxdepth: 1
77

88
backwards_compatibility
9-
eager_lazy_eval
9+
copies_views_and_mutation
1010
parallelism
1111
static_typing
1212
array_ducktyping
