BUG: make array-detection logic in array(...) more robust #138

Merged
2 commits merged May 17, 2023

Conversation

@ev-br ev-br commented May 13, 2023

Needed for arrays hiding as elements of nested lists, e.g. np.array([[1, 2], [3, np.array(4)]]).

Split off from gh-137.

ev-br added 2 commits May 13, 2023 12:37
This is needed for arrays hiding as elements of nested lists:
e.g. asarray([[1, 2], [3, np.array(4)]])
@ev-br ev-br requested a review from lezcano May 13, 2023 09:40
@@ -449,12 +449,28 @@ def __dlpack_device__(self):
return self.tensor.__dlpack_device__()


def _tolist(obj):
"""Recusrively convert tensors into lists."""
a1 = []
Collaborator

Bring the if isinstance(obj, (list, tuple)) check in here; otherwise this function assumes that obj is an iterable, which is a non-trivial thing to assume in this context.

Collaborator

Yeah. What I'm saying is that the logic of this function assumes it implicitly. The isinstance check should be inside this function, not at the caller site.

Comment on lines 487 to +488
if isinstance(obj, (list, tuple)):
    a1 = []
    for elem in obj:
        if isinstance(elem, ndarray):
            a1.append(elem.tensor.tolist())
        else:
            a1.append(elem)
    obj = a1
obj = _tolist(obj)
Collaborator

Also, in what context do we need this? as_tensor already seems to do this:

>>> torch.as_tensor([[2,3], np.array([2,3])])
tensor([[2, 3],
        [2, 3]])

ev-br (Collaborator Author) commented May 13, 2023

In main, it only looks into the first level of nesting:

In [3]: tnp.array([[1, 2], [tnp.int8(1), 2]])
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[3], line 1
----> 1 tnp.array([[1, 2], [tnp.int8(1), 2]])

File ~/sweethome/proj/scipy/torch_np_compat/torch_np/_ndarray.py:495, in array(obj, dtype, copy, order, subok, ndmin, like)
    493 def array(obj, dtype=None, *, copy=True, order="K", subok=False, ndmin=0, like=None):
    494     # The result of the public `np.array(obj)` is not weakly typed.
--> 495     return _array(obj, dtype, copy=copy, order=order, subok=subok, ndmin=ndmin, like=like, is_weak=False)

File ~/sweethome/proj/scipy/torch_np_compat/torch_np/_ndarray.py:489, in _array(obj, dtype, copy, order, subok, ndmin, like, is_weak)
    486 if dtype is not None:
    487     torch_dtype = _dtypes.dtype(dtype).torch_dtype
--> 489 tensor = _util._coerce_to_tensor(obj, torch_dtype, copy, ndmin, is_weak)
    490 return ndarray(tensor)

File ~/sweethome/proj/scipy/torch_np_compat/torch_np/_util.py:213, in _coerce_to_tensor(obj, dtype, copy, ndmin, is_weak)
    211     tensor = torch.as_tensor(obj, dtype=dtype)
    212 else:
--> 213     tensor = torch.as_tensor(obj)
    215     # tensor.dtype is the pytorch default, typically float32. If obj's elements
    216     # are not exactly representable in float32, we've lost precision:
    217     # >>> torch.as_tensor(1e12).item() - 1e12
   (...)
    220     # Therefore, we treat `tensor.dtype` as a hint, and convert the
    221     # original object *again*, this time with an explicit dtype.
    222     torch_dtype = _dtypes_impl.get_default_dtype_for(tensor.dtype)

File ~/sweethome/proj/scipy/torch_np_compat/torch_np/_ndarray.py:391, in ndarray.__len__(self)
    390 def __len__(self):
--> 391     return self.tensor.shape[0]

IndexError: tuple index out of range

A smoke test is https://github.com/Quansight-Labs/numpy_pytorch_interop/pull/138/files#diff-66e581e6a3a373197465b30dd84684d915c83f3c619773382e8f8a40140662a5R558

EDIT: the traceback shows it's from a branch, not main, but the point stands.
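The failure mode above can be reproduced in a self-contained way. This is a hypothetical illustration (MiniArray stands in for torch_np's ndarray): a single-level pass misses wrappers nested one level deeper, which is what the traceback demonstrates, while a recursive pass catches them.

```python
class MiniArray:
    """Toy stand-in for a tensor-backed array wrapper (hypothetical)."""
    def __init__(self, data):
        self.data = data

    def tolist(self):
        return self.data


def one_level(obj):
    # Converts only the direct elements of obj, as main effectively does.
    return [e.tolist() if isinstance(e, MiniArray) else e for e in obj]


def recursive(obj):
    # Walks arbitrarily deep nesting, as this PR's _tolist does.
    if isinstance(obj, MiniArray):
        return obj.tolist()
    if isinstance(obj, (list, tuple)):
        return [recursive(e) for e in obj]
    return obj


nested = [[1, 2], [MiniArray(1), 2]]
print(one_level(nested))  # the inner MiniArray survives untouched
print(recursive(nested))  # [[1, 2], [1, 2]]
```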

Collaborator

But I think this works in PyTorch?

>>> torch.as_tensor([[1,2],[np.int8(1), 2]])
tensor([[1, 2],
        [1, 2]])

ev-br (Collaborator Author) commented May 14, 2023

With numpy ndarrays, yes. With torch_np wrapper ndarrays: not in main, but yes with this PR.

Collaborator

Then why call tolist()? Shouldn't we just unwrap the tensor if it's a tnp array, and call item() if it's of weak dtype?

Collaborator

nvm, I just read #138 (comment). Can you add a comment explaining this point?

ev-br (Collaborator Author) commented May 14, 2023

Note that this function is almost identical to _helpers/ndarrays_to_tensors, which is used for fancy indexing and basically converts nested arrays to tensors. Dropping .tolist() here sounded tempting, but it does not quite work because numpy allows, for instance,

(Pdb) import numpy as _np
(Pdb) p x
array([[1., 2.],
       [3., 4.]], dtype=float32)
(Pdb) _np.asarray([x, 2*x, 3*x])
array([[[ 1.,  2.],
        [ 3.,  4.]],

       [[ 2.,  4.],
        [ 6.,  8.]],

       [[ 3.,  6.],
        [ 9., 12.]]], dtype=float32)

(Pdb) import torch
(Pdb) t
tensor([[1., 2.],
        [3., 4.]])
(Pdb) torch.as_tensor([t, 2*t, 3*t])
*** ValueError: only one element tensors can be converted to Python scalars
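This is why the PR converts wrapped sub-arrays all the way down to nested Python lists before handing them to torch.as_tensor. A minimal pure-Python sketch of the idea (MiniArray is a hypothetical stand-in for a 2-D array wrapper; the real code uses tensor.tolist()):

```python
class MiniArray:
    """Toy stand-in for a 2-D array wrapper (hypothetical)."""
    def __init__(self, data):
        self.data = data

    def tolist(self):
        return self.data

    def __mul__(self, k):
        # Elementwise scalar multiply, enough for this demonstration.
        return MiniArray([[k * v for v in row] for row in self.data])


def _tolist(obj):
    # Recursively replace wrappers with plain nested lists.
    if isinstance(obj, MiniArray):
        return obj.tolist()
    if isinstance(obj, (list, tuple)):
        return [_tolist(e) for e in obj]
    return obj


x = MiniArray([[1.0, 2.0], [3.0, 4.0]])
stacked = _tolist([x, x * 2, x * 3])
# `stacked` is now a plain 3x2x2 nested list, which torch.as_tensor can
# consume to build the 3-D tensor, mirroring np.asarray([x, 2*x, 3*x]).
print(stacked[1])  # [[2.0, 4.0], [6.0, 8.0]]
```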

@lezcano left a comment
Just missing a couple of nits discussed in the comments, but otherwise LGTM.

@@ -449,12 +449,28 @@ def __dlpack_device__(self):
return self.tensor.__dlpack_device__()


def _tolist(obj):
"""Recusrively convert tensors into lists."""
Collaborator

nit: Recursively

lezcano commented May 17, 2023

As discussed offline, this may be not super efficient, but we don't care about tracing efficiency atm, so in it goes.

@lezcano lezcano merged commit 133c367 into main May 17, 2023
@ev-br ev-br deleted the array_recurse branch May 17, 2023 15:46