-
Notifications
You must be signed in to change notification settings - Fork 129
ENH: Implement COLA library rewrites for linear algebra functions #573
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
I think the easiest short term path is to add a feature (like the shape feature) that For instance when you replace a variable by another you can assume what you inferred for the old variable holds for the new. When a new variable is added you can compute new stuff. Like x * 5, where x is positive/diagonal is also positive/diagonal. If it's positive it's not negative and so on. Some type info needs to be seeded by the user initially, as they won't emerge naturally from realistic graphs. Right now we use tags for that but maybe we could add a dummy pass through Op, like that is ultimately discarded but feeds initial type info. User could say Or we could introspect asserts like |
This is how sympy handles it, with their assumptions system. For example:
Gives a list of implications of
All of these have implications for their algebraic simplifications. For example, it will refuse to simplify It's actually a nice system but I read they were trying to revamp it, so I'm curious what they think the big failures of it are. It might be nice to try to set up a call with their devs to talk about their lessons from going this route before we commit to it. I also think something like this is really where egg rewrites can shine. Especially if we're just dealing with matrices, we shouldn't have to reason about shapes too much (although Blockwise somewhat ruins that...) |
Definitely |
hey @jessegrabowski! are there any more related PRs that i could try working on for this project? otherwise, from what i understand, this is mainly focused on graph rewriting? and we want to be using the computations suggested in the paper in pytensor |
Yeah basically we want to introduce simplifications related to compositions of linear algebra operations via rewrites. The COLA paper suggests a big pile, so that's what I suggested as a "roadmap". The only active one right now is #622 , you can look here for a rough template of a rewrite looks like. Maybe a "simple" rewrite to get your feet wet would be to change det(diag(x)) to prod(x). The rewrite should look for a Obviously there are other ways we could look for a diagonal matrix to arise (for example, eye * x), but that would be a good place to get the ball rolling. |
alright i understand. |
hey @jessegrabowski |
Maybe this video by @ricardoV94 would be nice to check out. He goes though making a rewrite live and explains a lot about how pytensor works: |
This may also be useful: https://gist.github.com/ricardoV94/deaf7b18660588faac1c30bc5b31c011 |
these were both really helpful thanks a lot! |
If you search on the discourse you'll find other applications, that should give you a sense of what the structure is meant to be. I'm not sure what resources you get on the Google side. For breaking down the project, I'd start by looking table 1 in the COLA paper, and trying to classify the optimizations into "easy", "medium", "hard", thinking about:
... and so on. Based on that analysis, you can sort of form a rough roadmap of what you think will be realistic to tackle over the course of your GSoC. It will also give you some nice material for a discussion/introduction session, and highlight areas you might need to learn more about (for example different kinds of matrices/decompositions/solvers/etc that maybe you've never seen before) |
great! i have started working on a proposal; hopefully i can share a first draft by tomorrow so I can make revisions to it before the final deadline. in this issue, there is a comment about how assumptions are handled in sympy. As this project relies a lot on assumptions too regarding nature of matrices and operations, is there an already existing system (from what I can tell, this has been done using tags) or should I also present an idea for dealing with assumptions (maybe similar to something like sympy) another thing i wanted to confirm is the duration of the project. the site states that all these projects will be 350hrs in duration so I am going with that itself unless any change is required? |
I don't think we should charge headfirst into a big assumptions system, we should just see how far we can get with pure graph introspection. I think there's already a lot we can do with this simpler approach. Regarding time, I guess I have no idea. It's up to you how big of a commitment you want to make, but certainly this is going to be a big effort. |
another quick question, is the list of shortcuts an exhaustive one? the table 1 in the paper has a lot more operations but have they been excluded on purpose because they are not that relevant here. |
hello @jessegrabowski @ricardoV94! should i share my proposal on the discourse for feedback or just privately with one of the mentors? |
I think discourse makes the most sense. Sorry I didn't see your other message last night. I think being really complete would be nice, and we can pair it down from there if we agree it's too repetitive/messy. |
cool cool. im gonna share it there then! |
hey! did you guys get time to take a look at my proposal. I have exams in
my college from day after so I'd really appreciate it if I could make any
updates today itself.
…On Fri, 29 Mar, 2024, 20:08 Jesse Grabowski, ***@***.***> wrote:
I think discourse makes the most sense.
Sorry I didn't see your other message last night. I think being really
complete would be nice, and we can pair it down from there if we agree it's
too repetitive/messy.
—
Reply to this email directly, view it on GitHub
<#573 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AV6XKQ5NIBJBGNGWMR75LGLY2VVEPAVCNFSM6AAAAABBO4S2Z6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDAMRXGMZTCMZSGA>
.
You are receiving this because you commented.Message ID:
***@***.***>
|
Looks great. Good luck with your exams! |
thanks! is there anything you guys found lacking or should i submit this as-is? |
We think it's good as is |
hey @jessegrabowski i was trying to make the list along with link to the corresponding CoLa rewrites but it seems like all the ones listed in the paper are not directly there in code? |
COLA doesn't implement everything I don't think. If you can't find it just make a note that it doesn't seem to be implemented and move on. You need to hurry up and submit! |
I have made a list here. Can this be a new PR which I can add in my proposal or should i just include the list itself. |
Include the list in your proposal and I'll edit this post to include the links (crediting you in the first post of course) |
I'm interested in participating in this project for GSOC 2025. Are the open issues up to date and available to work on this summer? |
@jessegrabowski Is there a place to start for accomodating harder to indentify types for the rewrites. Like making custom classes for them or starting with a basic type inference system. Any suggestions would be appreciated. |
Description
This paper and this library describe and implement a number of linear algebra simplifications that we can implement as graph rewrites. This issue is both a tracker for implementing these rewrites, and a discussion for how to handle them.
Consider the following graph:
If we could promise at rewrite time that
y
is diagonal, we could re-write the last operation asz = 1/y
, exploiting the structure of the diagonal matrix. Other non-trivial examples exist, for example:If we could promise at rewrite time that
z
is block diagonal, we could rewritez_inv = pt.kron(pt.linalg.inv(x), y)
, which is a much faster operation (sincex
is 3x smaller thanz
).The linked paper and library list a huge number of such shortcut computations. The following is a list version of Table 1 from the paper. Under each function are the types of matrices for which a rewrite rule exists to accelerate the function.
It would be nice to collaborative update this list with links to the COLA library where the relevant rewrite is implemented:Thanks to @tanish1729 for compiling a list of links to relevant rewrites. Missing links indicate no direct implementation
cholesky(A)
inv(A)
eigs(A)
diag(A)/trace(A)
logdet(A)
Unary Operations (such as log, pow, exp, sqrt)
In addition, potential re-writes could also be applied to Topelitz and Circulant matrices, although these are not covered by COLA.
The wrinkle to all this is that we would need more information about matrices as they enter and exit
Ops
. Right now, we're using tags to accomplish rewrites like this, see for example #303 and #459 . Some of these rewrites might be possible to do via inference. For example, apt.linalg.block_diag
Op
always returns a block diagonal matrix, as doespt.kron(pt.eye, A)
.pt.diagonal
always returns a diagonal matrix, as doespt.eye
;pt.linalg.cholesky
always returns a triangular matrix, etc. Other potential types, likeblock
,positive
,definite
,psd
,topelitz
,circulant
, etc, would be less trivial to automatically detect.The other issue is that as these type tags proliferate, we become more and more locked into a somewhat hack-y system for marking things. Perhaps putting some thought into how to handle this now will save some refactoring headaches down the road?
The text was updated successfully, but these errors were encountered: