
Make ones/zeros/empty/full dtype handling more uniform #51


Merged: ev-br merged 7 commits into main from infer-full-dtype-followup on Feb 16, 2023

Conversation

ev-br (Collaborator) commented Feb 9, 2023

While at it, un-xfail test_numeric.py::TestTypes.

@ev-br ev-br force-pushed the infer-full-dtype-followup branch from 74cdad2 to 47e8adf on February 9, 2023
ev-br (Collaborator, Author) commented Feb 9, 2023

@honno what do you think of this further tweak to full?

honno (Contributor) left a comment

Generally everything looks good.

> what do you think of this further tweak to full?

LGTM! Updated my exploratory test for it, didn't find any issues. Everything else I suppose we expect to get covered anyway when porting tests, if they're not already.

@ev-br ev-br force-pushed the infer-full-dtype-followup branch from 8f1b501 to 89a8d54 on February 13, 2023
@ev-br ev-br requested a review from honno February 13, 2023 18:46
ev-br (Collaborator, Author) commented Feb 13, 2023

1b41e14 implements the dtype.kind symbol following NumPy: it's basically the first character of the dtype category ('f' for all floats, 'c' for complex variants, 'b' for bool, 'i' for integers).
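For reference, the NumPy behaviour being mirrored here (a quick illustrative session, not part of the original comment):

>>> import numpy as np
>>> [np.dtype(t).kind for t in (np.float32, np.complex128, np.bool_, np.int16)]
['f', 'c', 'b', 'i']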

honno (Contributor) commented Feb 15, 2023

Niche behaviour I found in NumPy

>>> np.zeros((1, 5), dtype="c")  # character dtype special case
array([[b'', b'', b'', b'', b'']], dtype='|S1')

I doubt we care to implement this; just wondering if it would be nice to check for it and raise NotImplementedError. Right now this PR gives

TypeError: data type 'c' not understood

ev-br (Collaborator, Author) commented Feb 15, 2023

ISTM this is the best we can do with non-numeric dtypes, which are not supported by PyTorch. Do you think NotImplementedError is more descriptive than TypeError?

lezcano (Collaborator) commented Feb 15, 2023

IMO TypeError is good here, as we don't plan to implement these.

lezcano (Collaborator) left a comment

There's a comment that applies all across the codebase, but otherwise the PR LGTM.

Feel free to fix that point in a future PR if you prefer.

@@ -288,12 +288,15 @@ def arange(start=None, stop=None, step=1, dtype=None, *, like=None):
         raise ValueError("Maximum allowed size exceeded")


+@_decorators.dtype_to_torch
 def empty(shape, dtype=float, order="C", *, like=None):
     _util.subok_not_ok(like)
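For context, a rough sketch of what a dtype_to_torch decorator could do — hypothetical, the real _decorators.dtype_to_torch may differ; _dtypes.torch_dtype_from is the mapping used elsewhere in this PR:

import functools
from torch_np import _dtypes  # import path assumed for illustration

def dtype_to_torch(func):
    # Convert a numpy-style dtype argument to a torch.dtype before calling func.
    @functools.wraps(func)
    def wrapped(*args, dtype=None, **kwds):
        if dtype is not None:
            dtype = _dtypes.torch_dtype_from(dtype)
        return func(*args, dtype=dtype, **kwds)
    return wrapped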
Collaborator:

There's a bit of a mismatch here?

Collaborator Author:

What mismatch?

Collaborator:

Shouldn't we have a function that's _util.like_not_implemented(like) or so? Why are we passing like to subok?

Collaborator Author:

Ah, so it's the util name which is confusing. Conceptually, I'd say that like and subok are similar: one is specifically for ndarray subclasses, the other is for a newer __array_wrap__ protocol (I think). Most functions which work for subclasses or the protocol have both arguments; some have one or the other. We are not going to support either of them. The utility is literally two lines:
https://github.com/Quansight-Labs/numpy_pytorch_interop/blob/main/torch_np/_detail/_util.py#L22

So if we go with utility functions, I'd keep it as is; I can rename it if really wanted. If we transform to the normalization approach of gh-32, these two would likely get their own type annotations.

So overall I'd rather not touch it now.
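The two-line guard mentioned above might look roughly like this (hypothetical sketch; the actual helper is at the link above):

def subok_not_ok(like=None, subok=False):
    # Reject the ndarray-subclass related arguments we do not support.
    if like is not None:
        raise ValueError("like=... parameter is not supported.")
    if subok:
        raise ValueError("subok parameter is not supported.")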

Comment on lines +322 to +324

+    fill_value = asarray(fill_value).get()
     if dtype is None:
-        torch_dtype = asarray(fill_value).get().dtype
-    else:
-        torch_dtype = _dtypes.torch_dtype_from(dtype)
-    return asarray(torch.full(shape, fill_value, dtype=torch_dtype))
+        dtype = fill_value.dtype
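Roughly, the intended effect of this change on full (a sketch, with the module import name assumed for illustration):

import torch_np as tnp  # import name assumed

tnp.full((2, 3), 1.0)              # dtype now inferred from the fill value
tnp.full((2, 3), 1.0, dtype="f4")  # an explicit dtype still takes precedence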
Collaborator:

A better way to implement this would be having a function that, given a tensor or a number, returns its torch dtype. The reason is that we don't want to wrap scalars into tensors if we don't need to: scalars are easier to treat in a compiler than tensors, and often get extra optimisations (no memory allocations, no synchronisations when accessed if we are working on CUDA, you can specialise on them...).

Note that this applies in general, so we should be mindful of it all across the codebase.
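A minimal sketch of such a helper (hypothetical name; the scalar-to-dtype mapping is simplified):

import torch

def dtype_of(value):
    # Return the torch dtype of a tensor or a bare Python scalar
    # without allocating a tensor for the scalar.
    if isinstance(value, torch.Tensor):
        return value.dtype
    if isinstance(value, bool):  # check bool before int: bool subclasses int
        return torch.bool
    if isinstance(value, int):
        return torch.int64
    if isinstance(value, float):
        return torch.get_default_dtype()
    if isinstance(value, complex):
        return torch.complex128
    raise TypeError(f"no torch dtype for {type(value)}")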

Collaborator Author:

Let's move this to #56, to avoid conflating a small cleanup of this PR with a rather large rework.

@ev-br ev-br merged commit c063e33 into main Feb 16, 2023
@ev-br ev-br deleted the infer-full-dtype-followup branch February 16, 2023 10:16
ev-br (Collaborator, Author) commented Feb 16, 2023

Merging to keep the ball rolling. The comments ask for codebase-wide tweaks, so they can be addressed in a sequel.
