Port indexing tests #23
Conversation
Ready for review! In terms of features, this PR predominantly implements advanced integer indexing. I still wouldn't be 100% confident in its robustness, as some other NumPy test files which seem to cover it still need importing, but it currently does a lot well. Still, that was fairly simple to implement, and most edge cases are tested in

In terms of development workflow, as discussed with Evgeni I'll need to keep trying to minimise my PRs in the future. I had thought "work through an imported test file, and cherry-pick any second-order features to split into a new PR" would be enough, but not really, given how annoying this might still be to review 😅 I think the idea of importing tests and xfailing them wholesale in one PR, then slowly un-xfailing things when implementing a more atomic feature (maybe utilising strict xfails more), makes sense.

Below is a collapsible containing the diff of
Oh, can you add xfails to specific asserts?! The more you know!

Also, in PyTorch (and so in this project) we often use xfail for things that we would like to work but have not implemented yet, and skip for things that we know for sure we're not going to implement. If in doubt, use xfail, of course. For things that work in PyTorch but are asserted in the tests to throw an error, feel free to do whatever you want, as we don't care too much about those.
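As a concrete illustration of that convention (hypothetical test names, a sketch rather than project code):

```python
import pytest

# xfail: behaviour we'd like to work but haven't implemented yet.
@pytest.mark.xfail(reason="f-order contiguity flags not implemented yet")
def test_f_contiguous_flag():
    ...

# skip: behaviour we know for sure we're not going to implement.
@pytest.mark.skip(reason="relies on NumPy buffer internals; won't implement")
def test_buffer_interface():
    ...
```

With `--runxfail` or a strict xfail (`strict=True`), an unexpectedly passing xfail test is reported as a failure, which is what makes wholesale-xfail-then-unxfail workflows trackable.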
```python
def __getitem__(self, index):
    index = _helpers.ndarrays_to_tensors(index)
    index = ndarray._upcast_int_indices(index)
    return ndarray._from_tensor_and_base(self._tensor.__getitem__(index), self)
```
Oh yes indeed. If the outcome of gh-47 is to keep the base, it'll need

```python
base = self if result._base is self._tensor._base else None
```
let's discuss this one in today's meeting
```python
assert_(d[...].flags.writeable)
assert_(d[0].flags.writeable)

@pytest.mark.xfail(reason="can't determine f-style contiguous in torch")
```
This should be "`f_contiguous` not implemented yet".
@ev-br we should be able to implement this. It's fairly low prio tho.
Ah, how does one implement this? For #49 I couldn't determine a way to infer Fortran-contiguity looking at `torch.memory_format` at least (i.e. if you tried all formats for `t.is_contiguous()` of the first `U` output of `torch.svd`).
You'd need to look at the strides, i.e., `x.stride()`.
Depending on where it's implemented, torch strides are element strides while numpy operates with byte strides, so https://github.com/Quansight-Labs/numpy_pytorch_interop/blob/main/torch_np/_ndarray.py#L106
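A minimal sketch of that translation: torch reports strides in elements while NumPy's flags work with bytes, so the conversion is a per-axis multiply by the itemsize, and F-contiguity falls out of the strides directly (pure-Python illustration, not the project's actual helper):

```python
def element_to_byte_strides(strides, itemsize):
    """torch strides are in elements; NumPy strides are in bytes."""
    return tuple(s * itemsize for s in strides)

def is_f_contiguous(shape, strides):
    """Column-major layout: each axis's element stride equals the
    product of all preceding axis lengths (length-1 axes are free)."""
    expected = 1
    for n, s in zip(shape, strides):
        if n != 1 and s != expected:
            return False
        expected *= n
    return True

# A (3, 4) Fortran-ordered array has element strides (1, 3) ...
assert is_f_contiguous((3, 4), (1, 3))
# ... while the C-ordered layout has strides (4, 1).
assert not is_f_contiguous((3, 4), (4, 1))
# For float64 (itemsize 8), element strides (1, 3) are byte strides (8, 24).
assert element_to_byte_strides((1, 3), 8) == (8, 24)
```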
Mhm, pretty nifty. To clarify, these in-line xfails trigger "xfail mode" after some expected-passing code has run... unfortunately there's no way to un-xfail afterwards, so I did re-order the odd assertion heh.
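The imperative form being discussed looks roughly like this (a sketch; once `pytest.xfail()` runs, nothing after it in the test executes, which is why assertion order matters):

```python
import pytest

def test_partial_feature():
    # Expected-passing assertions run normally up to this point:
    assert 1 + 1 == 2
    # From here the test is reported as xfail; later lines never run,
    # so there is no way to "un-xfail" a subsequent assertion.
    pytest.xfail(reason="f-order contiguity flags not implemented yet")
    assert False  # never reached
```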
Ayup, makes sense. Changed some xfails to skips which I think fit the bill of "not going to implement".
Wow, this is great! Left several comments, all of which can be addressed either here or in a sequel, whichever is more convenient.
torch_np/_ndarray.py (outdated):

```python
    torch.int64,
    torch.uint8,
]:
    value = value.to(self.dtype.torch_dtype)
```
So pytorch would not do it for us under the hood?
Actually it does:

```python
>>> a = torch.randn(3)
>>> b = torch.tensor([1, 2])
>>> a[0:2] = b
>>> a
tensor([1.0000, 2.0000, 1.8933])
```
Ah, I seem to have added this without the proper consideration 😅 It looks like I added this for `test_index_is_larger`:

```python
def test_index_is_larger(self):
    a = np.zeros((5, 5))
    a[[[0], [1], [2]], [0, 1, 2]] = [2, 3, 4]
    ...
```

which in torch terms would previously go like

```python
>>> t = torch.zeros((5, 5))
>>> value = torch.as_tensor([2, 3, 4])
>>> t[[[0], [1], [2]], [0, 1, 2]] = value
RuntimeError: Index put requires the source and destination dtypes match, got Float for the destination and Long for the source.
```

You see, `value` above is converted to a tensor because of this conversion:

numpy_pytorch_interop/torch_np/_ndarray.py, line 391 in 4c5eb7d:

```python
value = asarray(value).get()
```

My change here additionally converts `value` to a float tensor, which doesn't hit the above `RuntimeError`.
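For comparison, NumPy itself performs this cast on assignment: integer values written into a float array are silently converted to the destination dtype, which is the behaviour the wrapper needs to reproduce:

```python
import numpy as np

a = np.zeros(3)      # float64 destination
a[0:2] = [1, 2]      # integer values are cast to float64 on assignment
assert a.dtype == np.float64
assert a[0] == 1.0 and a[1] == 2.0
```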
(Just flying by to document this; will have a proper think on it later.)
Removed this internal casting. xfail'd `test_index_is_larger`, as it's finicky to implement these differently-sized input/output put-like operations; something to look at when we implement `tnp.put`, I'd imagine.
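For reference, the NumPy semantics the xfail'd test exercises: the (3, 1) and (3,) index arrays broadcast to a (3, 3) selection, and the 3-element right-hand side is then broadcast across every selected row (plain NumPy, shown as the behaviour a torch port would need to match):

```python
import numpy as np

a = np.zeros((5, 5))
# Index arrays of shapes (3, 1) and (3,) broadcast to a (3, 3) selection;
# the 3-element value is broadcast across each selected row.
a[[[0], [1], [2]], [0, 1, 2]] = [2, 3, 4]
assert (a[:3, :3] == [2, 3, 4]).all()  # rows 0-2 each receive 2, 3, 4
assert a[3:].sum() == 0                # remaining rows untouched
```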
Fine. Can you leave a comment though?
- Also allows for `IndexError` in `test_index_no_floats`
- Effort to test thoroughly, so trusting in NumPy's own tests down the line
- More of an exploratory test than anything, so low-prio to figure out the flakiness issue
- Slightly more optimised
So overall, how about merging it at the current stage and then iterating on finer details?
Sure, just leave a comment in #23 (comment) and feel free to merge.
Well, I reverted the change. The failing test I've xfailed with the reason (In terms of the use of

Fair enough.

Let's merge to keep the ball rolling. Thank you @honno!
WIP, no need for review yet. Currently going through `core/tests/test_indexing.py`; might include (much smaller) test files that also cover indexing.