Split Op returns view in Numba backend #343

Closed

aseyboldt opened this issue Jun 14, 2023 · 3 comments · Fixed by #344

@aseyboldt

Description

The numba backend seems to compile an incorrect gradient graph:

import pymc as pm
import numpy as np

N, M = 2, 2

with pm.Model() as model:
    pm.ZeroSumNormal("x", n_zerosum_axes=1, shape=(N, M))

func1 = model.logp_dlogp_function(mode="NUMBA")
func2 = model.logp_dlogp_function()

func1.set_extra_values({})
func2.set_extra_values({})

x = np.ones((N, M - 1))  # or e.g. np.random.randn(N, M - 1)

print(func1._pytensor_function(x))  # NUMBA backend
# [array(-2.33787707),
#  array([[-0.5       ],
#         [-0.60355339]])]

print(func2._pytensor_function(x))  # default backend
# [array(-2.33787707),
#  array([[-0.5],
#         [-0.5]])]

# The logp values agree, but the second gradient entry differs between backends.
@ricardoV94

I'll take a look

@ricardoV94 added the bug, numba, and gradients labels on Jun 15, 2023
@ricardoV94 commented Jun 15, 2023

Seems to be an issue with the Split Op returning a view and not a copy, so an inplace Op in the Numba backend ends up writing into the original input?

import pytensor
import pytensor.tensor as pt
import numpy as np

x1 = pt.matrix("x1")
x2 = pt.matrix("x2", shape=(None, 1))
v = pt.vector("v", shape=(2,), dtype=int)
out = pt.split(x1, v, n_splits=2, axis=1)[0] + x2

fn = pytensor.function([x1, x2, v], out, "NUMBA")
pytensor.dprint(fn, print_type=True)

# Add [id A] <Matrix(float64, shape=(?, ?))> 1
#  ├─ Split{2}.0 [id B] <Matrix(float64, shape=(?, ?))> 0
#  │  ├─ x1 [id C] <Matrix(float64, shape=(?, ?))>
#  │  ├─ 1 [id D] <Scalar(int8, shape=())>
#  │  └─ v [id E] <Vector(int64, shape=(2,))>
#  └─ x2 [id F] <Matrix(float64, shape=(?, 1))>

rng = np.random.default_rng(123)
test_x1 = rng.normal(size=(2, 2))
test_x1_copy = test_x1.copy()
test_x2 = rng.normal(size=(2, 1))
test_v = np.array([1, 1])

print(fn(test_x1, test_x2, test_v))  # [[-0.06889045], [ 1.86502905]]
print(fn(test_x1, test_x2, test_v))  # [[0.85134045], [2.44213284]]  <- same inputs, different result!
print(test_x1_copy)  # [[-0.98912135 -0.36778665], [ 1.28792526  0.19397442]]
print(test_x1)       # [[ 0.85134045 -0.36778665], [ 2.44213284  0.19397442]]  <- test_x1 was mutated
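
For context, the aliasing is easy to reproduce with NumPy alone: np.split returns views of its input, so an implementation that hands those views to an inplace consumer lets it write through to the original array. A minimal illustration (mine, not PyTensor's Numba code):

import numpy as np

x = np.arange(4.0).reshape(2, 2)
left, right = np.split(x, [1], axis=1)
print(np.shares_memory(left, x))  # True: each piece is a view into x
left += 100.0                     # write "into the split output"...
print(x)                          # ...and x changes too: [[100., 1.], [102., 3.]]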

By the way, I made a small gist showing how I narrowed down the problem. Sharing because it could be useful in the future: https://gist.github.com/ricardoV94/ae4c51365b871713bc3cca735fe8fa2f
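
The gist has the details, but the rough idea (my reconstruction, not the gist's exact code) is to compile every intermediate variable of the graph under both backends and report the first disagreement. It won't catch everything, since each sub-function is rewritten and compiled independently, but it narrows down which Op the backends disagree on:

import numpy as np
import pytensor
from pytensor.graph.basic import io_toposort

def first_backend_mismatch(inputs, output, test_values, atol=1e-12):
    # Walk the graph in topological order, compiling each intermediate
    # variable with the default backend and with NUMBA.
    for node in io_toposort(inputs, [output]):
        for var in node.outputs:
            f_py = pytensor.function(inputs, var, on_unused_input="ignore")
            f_nb = pytensor.function(inputs, var, mode="NUMBA", on_unused_input="ignore")
            if not np.allclose(f_py(*test_values), f_nb(*test_values), atol=atol):
                print("first mismatch at:", var)
                return var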

@ricardoV94 commented Jun 15, 2023

Removing the inplace optimizations fixes the problem, because the Elemwise addition then allocates a new output buffer instead of writing into its input.
AFAICT it should be the Split Op's responsibility to make a copy?

import pytensor
import pymc as pm
import numpy as np

from pytensor.compile.mode import get_mode

N, M = 2, 2

with pm.Model(check_bounds=False) as model:
    pm.ZeroSumNormal("x", n_zerosum_axes=1, shape=(N, M))

graph = model.dlogp()
test_value = np.ones((N, M-1))

mode = get_mode("NUMBA").excluding("inplace")

# With the inplace rewrites excluded, the NUMBA gradient matches the default backend
pytensor.function(pm.inputvars([graph]), graph, mode=mode)(test_value)  # array([-0.5, -0.5])
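
The fix landed in #344. As a minimal sketch of the idea (hypothetical helper, not the actual patch), the Numba implementation of Split just needs to copy each piece so every output owns its memory:

import numpy as np

def split_with_copy(x, splits_size, axis):
    # np.split returns views into x; copying gives each output its own
    # buffer, so downstream inplace Ops can no longer clobber x.
    split_points = np.cumsum(splits_size)[:-1]
    return [piece.copy() for piece in np.split(x, split_points, axis=axis)]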

@ricardoV94 self-assigned this on Jun 15, 2023
@ricardoV94 changed the title from "Invalid gradient values in numba backend" to "Split Op returns view in Numba backend" on Jun 15, 2023