Blogpost for the PyTorch blog #174
Conversation
Thanks Mario, great post.
blogpost/post.md (outdated)

> considerable speed-ups.
>
> This can of course be combined with `torch.compile` to be able to compile
> programs that rely on these other libraries.
Not quite true for SciPy, since it also encompasses compiled code without matching functionality in PyTorch.
blogpost/post.md (outdated)

> From Quansight, we would like to thank Meta for funding this project and all
> the previous work that led to it, like improving the NumPy compatibility
> within PyTorch, and developing the [Python Array API](https://data-apis.org/array-api/latest/).
This didn't involve any Meta funding actually. Quansight and Intel have been the largest funders, the others are listed on https://data-apis.org/. Meta did fund the work on adding PyTorch support to https://github.com/data-apis/array-api-compat, but that work and funding acknowledgement are covered in Thomas' blog post on the scikit-learn work.
Addressed the review. The main missing point is that I only start talking about QS in the penultimate section. I would like to mention QS somewhere at the top as well, but I really like the introductory paragraph and I don't know how to work it in. I've added a proposal in the last commit, but I don't fully like it. Could I get some feedback particularly on that?

In particular, could you guys give further feedback on the introductory paragraph and the last two sections? I'd like to make sure those are properly tight.
blogpost/post.md (outdated)

> The compiled function yields a 9x speed-up when running it on 1 core. Even
> better, since the compiled code also runs on multiple cores, we get a **57x speed-up**
> when running it on 32 cores. Note that vanilla NumPy always runs on
> just one core.
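For readers skimming this thread, the pattern being benchmarked looks roughly like this (a minimal sketch, assuming PyTorch ≥ 2.1 with its compiled-NumPy support; `pairwise_dist` is a hypothetical stand-in for the post's actual function):

```python
import numpy as np
import torch

def pairwise_dist(X, means):
    # Pure NumPy code: distance from each point to each center.
    return np.linalg.norm(X - means[:, None], axis=2)

# torch.compile traces the NumPy calls and lowers them to PyTorch ops;
# plain ndarrays go in, a plain ndarray comes out.
compiled_dist = torch.compile(pairwise_dist)

X = np.random.randn(10_000, 2)
means = np.random.randn(8, 2)
out = compiled_dist(X, means)  # same result as the eager NumPy call
```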
Have you compared the performance to other options for compiling NumPy? The `get_labels` example contains functions which are supported by `numba`, so that might be an interesting datapoint to see how the `torch.compile` speed-up compares to `numba.jit`.
Not really, as this would make our announcement post way too long. I expect other people to put up posts comparing this and other approaches (Julia / numba / torch.jit / mojo), but that's beyond the scope of this post IMO.
I plan to write a follow-up, based on #168 or https://github.com/Quansight-Labs/numpy_pytorch_interop/tree/main/e2e/smoke, comparing to numba.jit may fit there.
That'd be super cool!
Weirdly enough, throwing `@numba.njit` in place of `torch.compile` in this example runs into several rough edges of numba:

- `np.linalg.norm(..., axis=2)` does not compile; it chokes on `axis`. Replacing `np.linalg.norm` with

  ```python
  def norm(a, axis):
      s = (a.conj() * a).real
      return np.sqrt(s.sum(axis=axis))
  ```

  and `njit`-ting both `norm` and `get_labels` gives a slowdown of about 60%.
- Trying `@njit(parallel=True)` on `get_labels` crashes the compiler, again on the `axis` argument (`TypeError: Failed in nopython mode pipeline (step: Preprocessing for parfors) got an unexpected keyword argument 'axis'`).
- Trying `@njit(parallel=True)` on `norm` only and `@njit` on `get_labels` yields a slowdown w.r.t. NumPy of 'only' 20%.
Got to admit I never had much luck with numba in non-toy situations.
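For anyone trying to reproduce this, here is a self-contained version of the workaround above. The `get_labels` body is a hypothetical k-means-style label assignment, since the post's exact code isn't quoted in this thread, and whether each piece compiles depends on the numba version, per the rough edges listed above:

```python
import numpy as np
from numba import njit

@njit
def norm(a, axis):
    # Hand-rolled replacement: np.linalg.norm(..., axis=2) does not
    # compile under numba, but this explicit reduction does.
    s = (a.conj() * a).real
    return np.sqrt(s.sum(axis=axis))

@njit
def get_labels(X, means):
    # Hypothetical stand-in for the post's example: assign each point
    # to its nearest center.
    return np.argmin(norm(X - means[:, None], axis=2), axis=0)

X = np.random.randn(10_000, 2)
means = np.random.randn(8, 2)
labels = get_labels(X, means)
```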
blogpost/post.md (outdated)

> replacement for the PyTorch API, as it is **much slower** and, as a private API,
> **may change without notice**.
>
> ## Differences between NumPy and `torch.compile`d NumPy
May want to link to #5 (or move the list in that issue somewhere more appropriate).
Yes, that's on my TODO list for next week!
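For context on the quoted warning: the private layer in question is presumably `torch._numpy`, the compatibility module this repo's work was upstreamed into (an assumption here, since the quoted lines don't name it). Using it eagerly looks like this and is discouraged for exactly the quoted reasons:

```python
# Assumption: the private compat layer lives at torch._numpy (PyTorch >= 2.1).
# It executes eagerly through PyTorch tensors, is much slower than going
# through torch.compile, and, being private, may change without notice.
import torch._numpy as tnp

x = tnp.arange(5)
y = tnp.sin(x) ** 2  # eager execution via PyTorch under the hood
```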
Reads great! My main confusion is around the section about mixing NumPy arrays and torch tensors. It might be good to also stress a recommendation for mixing the two: does torch.compile gracefully handle the mix, and what is the recommendation (do not mix, mix freely, or something else)?
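A sketch of the pattern in question, assuming the semantics described in the post's draft: inside a compiled region, `tensor.numpy()` and `torch.from_numpy` are traced rather than performing real conversions, so mixing is mediated by explicit conversions at the boundaries:

```python
import numpy as np
import torch

@torch.compile
def mixed(t):
    # Inside the compiled region the conversions below are traced away:
    # no host copy, just a change of "view" for the tracer.
    x = t.numpy()               # tensor -> ndarray (traced)
    y = np.sin(x) ** 2          # NumPy code, lowered to PyTorch ops
    return torch.from_numpy(y)  # ndarray -> tensor (traced)

t = torch.randn(8, device="cpu")
out = mixed(t)
```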
LGTM!
Merging just to archive it. The last version of this post is at https://hackmd.io/@82vuWEfETza6QgoTVJYnew/SkGEpt_R2/edit