Description
Analyzing graphs with reshape operations is rather complex because `Reshape` represents what we want, but not "what it means".
Except for esoteric cases where `Reshape` shapes may come from a complex computation / shapes of other variables, it is usually a case of multiplying some dimensions (merging) and dividing others (splitting). We could represent these cases with some sort of symbolic mapping:

- It almost begs for an extension of `DimShuffle`, which was brought up before: Theano/Theano#4640
- Splitting dims is trickier, because there are many choices; we can split in different orders and sizes
Still, an Op that achieves the same as splitting via reshape, but knows which dims are going where (and in what quantities), would be more readable.
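One way to picture such a symbolic mapping is an Op that records which axes are merged or split. A minimal NumPy sketch of the intended semantics (the `join_dims` / `split_dim` names are hypothetical, not an existing PyTensor API):

```python
import numpy as np

def join_dims(x, start, end):
    """Merge contiguous axes [start, end) into one (hypothetical op).

    Unlike a raw reshape, an Op like this records *which* axes were
    merged, so rewrites can reason about it symbolically.
    """
    shape = x.shape
    merged = shape[:start] + (int(np.prod(shape[start:end])),) + shape[end:]
    return x.reshape(merged)

def split_dim(x, axis, sizes):
    """Split one axis into several axes of the given sizes (hypothetical op)."""
    shape = x.shape
    assert int(np.prod(sizes)) == shape[axis]
    return x.reshape(shape[:axis] + tuple(sizes) + shape[axis + 1:])

x = np.empty((2, 3, 4))
assert join_dims(x, 0, 2).shape == (6, 4)             # merge axes 0 and 1
assert split_dim(x, 2, (2, 2)).shape == (2, 3, 2, 2)  # split axis 2
```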
An example where `Reshape` is currently hard to work with is during vectorization. If we have a common graph like `reshape(x, x.shape[0] * x.shape[1], -1)`, we cannot return the desired output `reshape(new_x, x.shape[0], x.shape[1] * x.shape[2], -1)` eagerly, because there is a chain of complex operations we must vectorize before we get to the `Reshape` node (`Shape` -> `Subtensor` -> `Mul` -> `MakeVector`). So we need to put it in a costly `Blockwise` and try our best to remove it during rewrites. This came up in #702 when vectorizing `tensordot` to get a `batched_tensordot`.
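For concreteness, the shape argument of that reshape is itself a symbolic graph; a small snippet showing where the chain comes from:

```python
import pytensor.tensor as pt

x = pt.tensor3("x")
# The target shape is computed symbolically:
# Shape(x) -> Subtensor (x.shape[0], x.shape[1]) -> Mul -> MakeVector,
# and all of it must be vectorized before the Reshape node is reached.
y = x.reshape((x.shape[0] * x.shape[1], -1))
```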
Such a problem wouldn't exist with a symbolic reshape that is told what dims are being joined/split.
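Under that assumption, vectorization reduces to offsetting axis indices by the number of new batch dimensions; a self-contained NumPy sketch (the `vectorized_join` helper is hypothetical):

```python
import numpy as np

def vectorized_join(x, batch_ndim, start, end):
    # Vectorizing a symbolic "join dims [start, end)" op only requires
    # shifting the recorded axes past the new batch dimensions; there
    # is no shape graph to traverse.
    start, end = start + batch_ndim, end + batch_ndim
    shape = x.shape
    merged = shape[:start] + (int(np.prod(shape[start:end])),) + shape[end:]
    return x.reshape(merged)

new_x = np.empty((5, 2, 3, 4))  # one new leading batch dim
assert vectorized_join(new_x, 1, 0, 2).shape == (5, 6, 4)
```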
It would also make rewrites to remove/lift reshapes much simpler than they currently are:
pytensor/pytensor/tensor/rewriting/shape.py, lines 798 to 895 in bf73f8a
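For instance, a "useless reshape" check against such a hypothetical op could become a static test on the recorded mapping, instead of a comparison of symbolic shape graphs (a sketch, not the actual rewrite):

```python
from dataclasses import dataclass

@dataclass
class JoinDims:
    """Hypothetical op merging axes [start, end) into one."""
    start: int
    end: int

def is_useless(op: JoinDims) -> bool:
    # With the mapping recorded on the op, "does this reshape do
    # anything?" is answered without inspecting the input's shape graph.
    return op.end - op.start <= 1

assert is_useless(JoinDims(2, 3))      # merging a single axis: no-op
assert not is_useless(JoinDims(0, 2))  # merging two axes: real reshape
```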
This is somewhat related to why we have `Second` and `Alloc`. The first one is easier to reason about because it tells us more immediately that we are broadcasting with the shape of a variable, whereas `Alloc` specifies the desired output without its meaning (especially after some rewrites, where the shape may become dissociated from the original variable):

pytensor/pytensor/tensor/rewriting/basic.py, lines 3 to 23 in d62f4b1
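For comparison, a small example of that distinction, using `pt.second` (an alias of `fill`) and `pt.alloc`:

```python
import pytensor.tensor as pt

x = pt.matrix("x")
y = pt.scalar("y")

# Second keeps the link to x explicit: "broadcast y like x".
a = pt.second(x, y)

# Alloc only records the target shape; after rewrites those shape
# variables may no longer be tied to x at all.
b = pt.alloc(y, x.shape[0], x.shape[1])
```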