BUG: Errors when upgrading to gcc version 15 #1401

Closed
mvds314 opened this issue May 14, 2025 · 28 comments · Fixed by #1403
Labels
bug Something isn't working C-backend installation

Comments

@mvds314

mvds314 commented May 14, 2025

Describe the issue:

I am getting compiler errors when I upgrade from gcc 14 to 15 (on Windows, using the gcc compiler from MSYS2).

I don't understand the details, but according to ChatGPT:

When PyTensor generates its C++ extension code (mod.cpp), it includes NumPy headers with:

#define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION

With GCC 15, deprecated functions in NumPy's C-API now raise hard errors, whereas GCC 14 would compile with a warning.

Reproducible code example:

import pytensor
from pytensor import tensor as pt

a = pt.dscalar()
b = pt.dscalar()
c = a + b
f = pytensor.function([a, b], c)

assert 4.0 == f(1.5, 2.5)

Error message:

segmentation fault (core dumped) python

PyTensor version information:

2.30.3

Context for the issue:

No response

@mvds314 mvds314 added the bug Something isn't working label May 14, 2025
@ricardoV94
Member

ricardoV94 commented May 14, 2025

May be related to #1398.

How did you install PyTensor?

Is there any error message/stacktrace?

The ChatGPT answer makes no sense: you would see a compile error in that case.

CC @maresb @lucianopaz

@maresb
Contributor

maresb commented May 14, 2025

Hi @mvds314, please recreate your Conda environment. That will force gcc 14.

@ricardoV94
Member

Would still be useful to know if it's the same error or different

@maresb
Contributor

maresb commented May 14, 2025

> whereas GCC 14 would compile with a warning

Also what's the warning?

@mvds314
Author

mvds314 commented May 16, 2025

The error I got was so long that I couldn't really figure out what the actual error is.
I solved it by downgrading my gcc in MSYS2 to version 14, as ChatGPT suggested; then everything works as expected.

Also, I am not using Anaconda; I am using WinPython with Python 3.13 and installed PyTensor with pip.

You can find the C code in this temporary file: C:\Users\USERNAME\AppData\Local\Temp\pytensor_compilation_error_11j8i4uf
ERROR (pytensor.graph.rewriting.basic): Rewrite failure due to: constant_folding
ERROR (pytensor.graph.rewriting.basic): node: ExpandDims{axes=[0, 1]}(0.8)
ERROR (pytensor.graph.rewriting.basic): TRACEBACK:
ERROR (pytensor.graph.rewriting.basic): Traceback (most recent call last):
File "..\python\Lib\site-packages\pytensor\graph\rewriting\basic.py", line 1922, in process_node
replacements = node_rewriter.transform(fgraph, node)
File "..\python\Lib\site-packages\pytensor\graph\rewriting\basic.py", line 1086, in transform
return self.fn(fgraph, node)
~~~~~~~^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\tensor\rewriting\basic.py", line 1173, in constant_folding
return unconditional_constant_folding.transform(fgraph, node)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\graph\rewriting\basic.py", line 1086, in transform
return self.fn(fgraph, node)
~~~~~~~^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\tensor\rewriting\basic.py", line 1122, in unconditional_constant_folding
thunk = node.op.make_thunk(node, storage_map, compute_map, no_recycling=[])
File "..\python\Lib\site-packages\pytensor\link\c\op.py", line 119, in make_thunk
return self.make_c_thunk(node, storage_map, compute_map, no_recycling)
~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\link\c\op.py", line 84, in make_c_thunk
outputs = cl.make_thunk(
input_storage=node_input_storage, output_storage=node_output_storage
)
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1185, in make_thunk
cthunk, module, in_storage, out_storage, error_storage = self.compile(
~~~~~~~~~~~~~~~~^
input_storage, output_storage, storage_map, cache
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1102, in compile
thunk, module = self.cthunk_factory(
~~~~~~~~~~~~~~~~~~~^
error_storage,
^^^^^^^^^^^^^^
...<3 lines>...
cache,
^^^^^^
)
^
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1626, in cthunk_factory
module = cache.module_from_key(key=key, lnk=self)
File "..\python\Lib\site-packages\pytensor\link\c\cmodule.py", line 1250, in module_from_key
module = lnk.compile_cmodule(location)
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1527, in compile_cmodule
module = c_compiler.compile_str(
module_name=mod.code_hash,
...<5 lines>...
preargs=preargs,
)
File "..\python\Lib\site-packages\pytensor\link\c\cmodule.py", line 2676, in compile_str
raise CompileError(
f"Compilation failed (return status={status}):\n{' '.join(cmd)}\n{compile_stderr}"
)
pytensor.link.c.exceptions.CompileError: Compilation failed (return status=1):
"..\msys64\mingw64\bin\g++.EXE" -shared -g -O3 -fno-math-errno -Wno-unused-label -Wno-unused-variable -Wno-write-strings -Wno-c++11-narrowing -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -m64 -DMS_WIN64 -I"..\python\Lib\site-packages\numpy_core\include" -I"..\python\include" -I"..\python\Lib\site-packages\pytensor\link\c\c_code" -L"..\python\libs" -L"..\python" -o "C:\Users\USERNAME\AppData\Local\PyTensor\compiledir_Windows-11-10.0.22631-SP0-Intel64_Family_6_Model_186_Stepping_2_GenuineIntel-3.13.3-64\tmpfs1rbpul\m52592e18fb66ba378e5be6fe9bccf91a7fa21d4d238285546214d2f3b9d3aec5.pyd" "C:\Users\USERNAME\AppData\Local\PyTensor\compiledir_Windows-11-10.0.22631-SP0-Intel64_Family_6_Model_186_Stepping_2_GenuineIntel-3.13.3-64\tmpfs1rbpul\mod.cpp" "..\python\python313.dll"

You can find the C code in this temporary file: C:\Users\USERNAME\AppData\Local\Temp\pytensor_compilation_error_elvweyp2
ERROR (pytensor.graph.rewriting.basic): Rewrite failure due to: constant_folding
ERROR (pytensor.graph.rewriting.basic): node: ExpandDims{axes=[0, 1]}(0.4)
ERROR (pytensor.graph.rewriting.basic): TRACEBACK:
ERROR (pytensor.graph.rewriting.basic): Traceback (most recent call last):
File "..\python\Lib\site-packages\pytensor\graph\rewriting\basic.py", line 1922, in process_node
replacements = node_rewriter.transform(fgraph, node)
File "..\python\Lib\site-packages\pytensor\graph\rewriting\basic.py", line 1086, in transform
return self.fn(fgraph, node)
~~~~~~~^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\tensor\rewriting\basic.py", line 1173, in constant_folding
return unconditional_constant_folding.transform(fgraph, node)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\graph\rewriting\basic.py", line 1086, in transform
return self.fn(fgraph, node)
~~~~~~~^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\tensor\rewriting\basic.py", line 1122, in unconditional_constant_folding
thunk = node.op.make_thunk(node, storage_map, compute_map, no_recycling=[])
File "..\python\Lib\site-packages\pytensor\link\c\op.py", line 119, in make_thunk
return self.make_c_thunk(node, storage_map, compute_map, no_recycling)
~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\link\c\op.py", line 84, in make_c_thunk
outputs = cl.make_thunk(
input_storage=node_input_storage, output_storage=node_output_storage
)
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1185, in make_thunk
cthunk, module, in_storage, out_storage, error_storage = self.compile(
~~~~~~~~~~~~~~~~^
input_storage, output_storage, storage_map, cache
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1102, in compile
thunk, module = self.cthunk_factory(
~~~~~~~~~~~~~~~~~~~^
error_storage,
^^^^^^^^^^^^^^
...<3 lines>...
cache,
^^^^^^
)
^
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1626, in cthunk_factory
module = cache.module_from_key(key=key, lnk=self)
File "..\python\Lib\site-packages\pytensor\link\c\cmodule.py", line 1250, in module_from_key
module = lnk.compile_cmodule(location)
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1527, in compile_cmodule
module = c_compiler.compile_str(
module_name=mod.code_hash,
...<5 lines>...
preargs=preargs,
)
File "..\python\Lib\site-packages\pytensor\link\c\cmodule.py", line 2676, in compile_str
raise CompileError(
f"Compilation failed (return status={status}):\n{' '.join(cmd)}\n{compile_stderr}"
)
pytensor.link.c.exceptions.CompileError: Compilation failed (return status=1):
"..\msys64\mingw64\bin\g++.EXE" -shared -g -O3 -fno-math-errno -Wno-unused-label -Wno-unused-variable -Wno-write-strings -Wno-c++11-narrowing -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -m64 -DMS_WIN64 -I"..\python\Lib\site-packages\numpy_core\include" -I"..\python\include" -I"..\python\Lib\site-packages\pytensor\link\c\c_code" -L"..\python\libs" -L"..\python" -o "C:\Users\USERNAME\AppData\Local\PyTensor\compiledir_Windows-11-10.0.22631-SP0-Intel64_Family_6_Model_186_Stepping_2_GenuineIntel-3.13.3-64\tmpryywwxq_\m52592e18fb66ba378e5be6fe9bccf91a7fa21d4d238285546214d2f3b9d3aec5.pyd" "C:\Users\USERNAME\AppData\Local\PyTensor\compiledir_Windows-11-10.0.22631-SP0-Intel64_Family_6_Model_186_Stepping_2_GenuineIntel-3.13.3-64\tmpryywwxq_\mod.cpp" "..\python\python313.dll"

You can find the C code in this temporary file: C:\Users\USERNAME\AppData\Local\Temp\pytensor_compilation_error_m23915ax
ERROR (pytensor.graph.rewriting.basic): Rewrite failure due to: constant_folding
ERROR (pytensor.graph.rewriting.basic): node: ExpandDims{axes=[0, 1]}(0.8)
ERROR (pytensor.graph.rewriting.basic): TRACEBACK:
ERROR (pytensor.graph.rewriting.basic): Traceback (most recent call last):
File "..\python\Lib\site-packages\pytensor\graph\rewriting\basic.py", line 1922, in process_node
replacements = node_rewriter.transform(fgraph, node)
File "..\python\Lib\site-packages\pytensor\graph\rewriting\basic.py", line 1086, in transform
return self.fn(fgraph, node)
~~~~~~~^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\tensor\rewriting\basic.py", line 1173, in constant_folding
return unconditional_constant_folding.transform(fgraph, node)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\graph\rewriting\basic.py", line 1086, in transform
return self.fn(fgraph, node)
~~~~~~~^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\tensor\rewriting\basic.py", line 1122, in unconditional_constant_folding
thunk = node.op.make_thunk(node, storage_map, compute_map, no_recycling=[])
File "..\python\Lib\site-packages\pytensor\link\c\op.py", line 119, in make_thunk
return self.make_c_thunk(node, storage_map, compute_map, no_recycling)
~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\link\c\op.py", line 84, in make_c_thunk
outputs = cl.make_thunk(
input_storage=node_input_storage, output_storage=node_output_storage
)
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1185, in make_thunk
cthunk, module, in_storage, out_storage, error_storage = self.compile(
~~~~~~~~~~~~~~~~^
input_storage, output_storage, storage_map, cache
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1102, in compile
thunk, module = self.cthunk_factory(
~~~~~~~~~~~~~~~~~~~^
error_storage,
^^^^^^^^^^^^^^
...<3 lines>...
cache,
^^^^^^
)
^
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1626, in cthunk_factory
module = cache.module_from_key(key=key, lnk=self)
File "..\python\Lib\site-packages\pytensor\link\c\cmodule.py", line 1250, in module_from_key
module = lnk.compile_cmodule(location)
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1527, in compile_cmodule
module = c_compiler.compile_str(
module_name=mod.code_hash,
...<5 lines>...
preargs=preargs,
)
File "..\python\Lib\site-packages\pytensor\link\c\cmodule.py", line 2676, in compile_str
raise CompileError(
f"Compilation failed (return status={status}):\n{' '.join(cmd)}\n{compile_stderr}"
)
pytensor.link.c.exceptions.CompileError: Compilation failed (return status=1):
"..\msys64\mingw64\bin\g++.EXE" -shared -g -O3 -fno-math-errno -Wno-unused-label -Wno-unused-variable -Wno-write-strings -Wno-c++11-narrowing -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -m64 -DMS_WIN64 -I"..\python\Lib\site-packages\numpy_core\include" -I"..\python\include" -I"..\python\Lib\site-packages\pytensor\link\c\c_code" -L"..\python\libs" -L"..\python" -o "C:\Users\USERNAME\AppData\Local\PyTensor\compiledir_Windows-11-10.0.22631-SP0-Intel64_Family_6_Model_186_Stepping_2_GenuineIntel-3.13.3-64\tmpbjrxt6xo\m52592e18fb66ba378e5be6fe9bccf91a7fa21d4d238285546214d2f3b9d3aec5.pyd" "C:\Users\USERNAME\AppData\Local\PyTensor\compiledir_Windows-11-10.0.22631-SP0-Intel64_Family_6_Model_186_Stepping_2_GenuineIntel-3.13.3-64\tmpbjrxt6xo\mod.cpp" "..\python\python313.dll"

You can find the C code in this temporary file: C:\Users\USERNAME\AppData\Local\Temp\pytensor_compilation_error_uqafz_v9
ERROR (pytensor.graph.rewriting.basic): Rewrite failure due to: constant_folding
ERROR (pytensor.graph.rewriting.basic): node: ExpandDims{axes=[0, 1]}(0.4)
ERROR (pytensor.graph.rewriting.basic): TRACEBACK:
ERROR (pytensor.graph.rewriting.basic): Traceback (most recent call last):
File "..\python\Lib\site-packages\pytensor\graph\rewriting\basic.py", line 1922, in process_node
replacements = node_rewriter.transform(fgraph, node)
File "..\python\Lib\site-packages\pytensor\graph\rewriting\basic.py", line 1086, in transform
return self.fn(fgraph, node)
~~~~~~~^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\tensor\rewriting\basic.py", line 1173, in constant_folding
return unconditional_constant_folding.transform(fgraph, node)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\graph\rewriting\basic.py", line 1086, in transform
return self.fn(fgraph, node)
~~~~~~~^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\tensor\rewriting\basic.py", line 1122, in unconditional_constant_folding
thunk = node.op.make_thunk(node, storage_map, compute_map, no_recycling=[])
File "..\python\Lib\site-packages\pytensor\link\c\op.py", line 119, in make_thunk
return self.make_c_thunk(node, storage_map, compute_map, no_recycling)
~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\link\c\op.py", line 84, in make_c_thunk
outputs = cl.make_thunk(
input_storage=node_input_storage, output_storage=node_output_storage
)
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1185, in make_thunk
cthunk, module, in_storage, out_storage, error_storage = self.compile(
~~~~~~~~~~~~~~~~^
input_storage, output_storage, storage_map, cache
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1102, in compile
thunk, module = self.cthunk_factory(
~~~~~~~~~~~~~~~~~~~^
error_storage,
^^^^^^^^^^^^^^
...<3 lines>...
cache,
^^^^^^
)
^
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1626, in cthunk_factory
module = cache.module_from_key(key=key, lnk=self)
File "..\python\Lib\site-packages\pytensor\link\c\cmodule.py", line 1250, in module_from_key
module = lnk.compile_cmodule(location)
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1527, in compile_cmodule
module = c_compiler.compile_str(
module_name=mod.code_hash,
...<5 lines>...
preargs=preargs,
)
File "..\python\Lib\site-packages\pytensor\link\c\cmodule.py", line 2676, in compile_str
raise CompileError(
f"Compilation failed (return status={status}):\n{' '.join(cmd)}\n{compile_stderr}"
)
pytensor.link.c.exceptions.CompileError: Compilation failed (return status=1):
"..\msys64\mingw64\bin\g++.EXE" -shared -g -O3 -fno-math-errno -Wno-unused-label -Wno-unused-variable -Wno-write-strings -Wno-c++11-narrowing -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -m64 -DMS_WIN64 -I"..\python\Lib\site-packages\numpy_core\include" -I"..\python\include" -I"..\python\Lib\site-packages\pytensor\link\c\c_code" -L"..\python\libs" -L"..\python" -o "C:\Users\USERNAME\AppData\Local\PyTensor\compiledir_Windows-11-10.0.22631-SP0-Intel64_Family_6_Model_186_Stepping_2_GenuineIntel-3.13.3-64\tmpadko9n0z\m52592e18fb66ba378e5be6fe9bccf91a7fa21d4d238285546214d2f3b9d3aec5.pyd" "C:\Users\USERNAME\AppData\Local\PyTensor\compiledir_Windows-11-10.0.22631-SP0-Intel64_Family_6_Model_186_Stepping_2_GenuineIntel-3.13.3-64\tmpadko9n0z\mod.cpp" "..\python\python313.dll"

You can find the C code in this temporary file: C:\Users\USERNAME\AppData\Local\Temp\pytensor_compilation_error__gtv7gqm
ERROR (pytensor.graph.rewriting.basic): Rewrite failure due to: constant_folding
ERROR (pytensor.graph.rewriting.basic): node: ExpandDims{axes=[0, 1]}(0.8)
ERROR (pytensor.graph.rewriting.basic): TRACEBACK:
ERROR (pytensor.graph.rewriting.basic): Traceback (most recent call last):
File "..\python\Lib\site-packages\pytensor\graph\rewriting\basic.py", line 1922, in process_node
replacements = node_rewriter.transform(fgraph, node)
File "..\python\Lib\site-packages\pytensor\graph\rewriting\basic.py", line 1086, in transform
return self.fn(fgraph, node)
~~~~~~~^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\tensor\rewriting\basic.py", line 1173, in constant_folding
return unconditional_constant_folding.transform(fgraph, node)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\graph\rewriting\basic.py", line 1086, in transform
return self.fn(fgraph, node)
~~~~~~~^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\tensor\rewriting\basic.py", line 1122, in unconditional_constant_folding
thunk = node.op.make_thunk(node, storage_map, compute_map, no_recycling=[])
File "..\python\Lib\site-packages\pytensor\link\c\op.py", line 119, in make_thunk
return self.make_c_thunk(node, storage_map, compute_map, no_recycling)
~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\link\c\op.py", line 84, in make_c_thunk
outputs = cl.make_thunk(
input_storage=node_input_storage, output_storage=node_output_storage
)
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1185, in make_thunk
cthunk, module, in_storage, out_storage, error_storage = self.compile(
~~~~~~~~~~~~~~~~^
input_storage, output_storage, storage_map, cache
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1102, in compile
thunk, module = self.cthunk_factory(
~~~~~~~~~~~~~~~~~~~^
error_storage,
^^^^^^^^^^^^^^
...<3 lines>...
cache,
^^^^^^
)
^
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1626, in cthunk_factory
module = cache.module_from_key(key=key, lnk=self)
File "..\python\Lib\site-packages\pytensor\link\c\cmodule.py", line 1250, in module_from_key
module = lnk.compile_cmodule(location)
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1527, in compile_cmodule
module = c_compiler.compile_str(
module_name=mod.code_hash,
...<5 lines>...
preargs=preargs,
)
File "..\python\Lib\site-packages\pytensor\link\c\cmodule.py", line 2676, in compile_str
raise CompileError(
f"Compilation failed (return status={status}):\n{' '.join(cmd)}\n{compile_stderr}"
)
pytensor.link.c.exceptions.CompileError: Compilation failed (return status=1):
"..\msys64\mingw64\bin\g++.EXE" -shared -g -O3 -fno-math-errno -Wno-unused-label -Wno-unused-variable -Wno-write-strings -Wno-c++11-narrowing -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -m64 -DMS_WIN64 -I"..\python\Lib\site-packages\numpy_core\include" -I"..\python\include" -I"..\python\Lib\site-packages\pytensor\link\c\c_code" -L"..\python\libs" -L"..\python" -o "C:\Users\USERNAME\AppData\Local\PyTensor\compiledir_Windows-11-10.0.22631-SP0-Intel64_Family_6_Model_186_Stepping_2_GenuineIntel-3.13.3-64\tmpdh9_sens\m52592e18fb66ba378e5be6fe9bccf91a7fa21d4d238285546214d2f3b9d3aec5.pyd" "C:\Users\USERNAME\AppData\Local\PyTensor\compiledir_Windows-11-10.0.22631-SP0-Intel64_Family_6_Model_186_Stepping_2_GenuineIntel-3.13.3-64\tmpdh9_sens\mod.cpp" "..\python\python313.dll"

You can find the C code in this temporary file: C:\Users\USERNAME\AppData\Local\Temp\pytensor_compilation_error_m5iggf5r
ERROR (pytensor.graph.rewriting.basic): Rewrite failure due to: constant_folding
ERROR (pytensor.graph.rewriting.basic): node: ExpandDims{axes=[0, 1]}(0.4)
ERROR (pytensor.graph.rewriting.basic): TRACEBACK:
ERROR (pytensor.graph.rewriting.basic): Traceback (most recent call last):
File "..\python\Lib\site-packages\pytensor\graph\rewriting\basic.py", line 1922, in process_node
replacements = node_rewriter.transform(fgraph, node)
File "..\python\Lib\site-packages\pytensor\graph\rewriting\basic.py", line 1086, in transform
return self.fn(fgraph, node)
~~~~~~~^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\tensor\rewriting\basic.py", line 1173, in constant_folding
return unconditional_constant_folding.transform(fgraph, node)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\graph\rewriting\basic.py", line 1086, in transform
return self.fn(fgraph, node)
~~~~~~~^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\tensor\rewriting\basic.py", line 1122, in unconditional_constant_folding
thunk = node.op.make_thunk(node, storage_map, compute_map, no_recycling=[])
File "..\python\Lib\site-packages\pytensor\link\c\op.py", line 119, in make_thunk
return self.make_c_thunk(node, storage_map, compute_map, no_recycling)
~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\link\c\op.py", line 84, in make_c_thunk
outputs = cl.make_thunk(
input_storage=node_input_storage, output_storage=node_output_storage
)
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1185, in make_thunk
cthunk, module, in_storage, out_storage, error_storage = self.compile(
~~~~~~~~~~~~~~~~^
input_storage, output_storage, storage_map, cache
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1102, in compile
thunk, module = self.cthunk_factory(
~~~~~~~~~~~~~~~~~~~^
error_storage,
^^^^^^^^^^^^^^
...<3 lines>...
cache,
^^^^^^
)
^
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1626, in cthunk_factory
module = cache.module_from_key(key=key, lnk=self)
File "..\python\Lib\site-packages\pytensor\link\c\cmodule.py", line 1250, in module_from_key
module = lnk.compile_cmodule(location)
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1527, in compile_cmodule
module = c_compiler.compile_str(
module_name=mod.code_hash,
...<5 lines>...
preargs=preargs,
)
File "..\python\Lib\site-packages\pytensor\link\c\cmodule.py", line 2676, in compile_str
raise CompileError(
f"Compilation failed (return status={status}):\n{' '.join(cmd)}\n{compile_stderr}"
)
pytensor.link.c.exceptions.CompileError: Compilation failed (return status=1):
"..\msys64\mingw64\bin\g++.EXE" -shared -g -O3 -fno-math-errno -Wno-unused-label -Wno-unused-variable -Wno-write-strings -Wno-c++11-narrowing -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -m64 -DMS_WIN64 -I"..\python\Lib\site-packages\numpy_core\include" -I"..\python\include" -I"..\python\Lib\site-packages\pytensor\link\c\c_code" -L"..\python\libs" -L"..\python" -o "C:\Users\USERNAME\AppData\Local\PyTensor\compiledir_Windows-11-10.0.22631-SP0-Intel64_Family_6_Model_186_Stepping_2_GenuineIntel-3.13.3-64\tmple51mlrc\m52592e18fb66ba378e5be6fe9bccf91a7fa21d4d238285546214d2f3b9d3aec5.pyd" "C:\Users\USERNAME\AppData\Local\PyTensor\compiledir_Windows-11-10.0.22631-SP0-Intel64_Family_6_Model_186_Stepping_2_GenuineIntel-3.13.3-64\tmple51mlrc\mod.cpp" "..\python\python313.dll"

You can find the C code in this temporary file: C:\Users\USERNAME\AppData\Local\Temp\pytensor_compilation_error_n5xsfake
Traceback (most recent call last):
File "..\python\Lib\site-packages\pytensor\link\vm.py", line 1230, in make_all
node.op.make_thunk(node, storage_map, compute_map, [], impl=impl)
~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\link\c\op.py", line 119, in make_thunk
return self.make_c_thunk(node, storage_map, compute_map, no_recycling)
~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\link\c\op.py", line 84, in make_c_thunk
outputs = cl.make_thunk(
input_storage=node_input_storage, output_storage=node_output_storage
)
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1185, in make_thunk
cthunk, module, in_storage, out_storage, error_storage = self.compile(
~~~~~~~~~~~~~~~~^
input_storage, output_storage, storage_map, cache
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1102, in compile
thunk, module = self.cthunk_factory(
~~~~~~~~~~~~~~~~~~~^
error_storage,
^^^^^^^^^^^^^^
...<3 lines>...
cache,
^^^^^^
)
^
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1626, in cthunk_factory
module = cache.module_from_key(key=key, lnk=self)
File "..\python\Lib\site-packages\pytensor\link\c\cmodule.py", line 1250, in module_from_key
module = lnk.compile_cmodule(location)
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1527, in compile_cmodule
module = c_compiler.compile_str(
module_name=mod.code_hash,
...<5 lines>...
preargs=preargs,
)
File "..\python\Lib\site-packages\pytensor\link\c\cmodule.py", line 2676, in compile_str
raise CompileError(
f"Compilation failed (return status={status}):\n{' '.join(cmd)}\n{compile_stderr}"
)
pytensor.link.c.exceptions.CompileError: Compilation failed (return status=1):
"..\msys64\mingw64\bin\g++.EXE" -shared -g -O3 -fno-math-errno -Wno-unused-label -Wno-unused-variable -Wno-write-strings -Wno-c++11-narrowing -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -m64 -DMS_WIN64 -I"..\python\Lib\site-packages\numpy_core\include" -I"..\python\include" -I"..\python\Lib\site-packages\pytensor\link\c\c_code" -L"..\python\libs" -L"..\python" -o "C:\Users\USERNAME\AppData\Local\PyTensor\compiledir_Windows-11-10.0.22631-SP0-Intel64_Family_6_Model_186_Stepping_2_GenuineIntel-3.13.3-64\tmp_oc9l42q\me7fa920728e5d2389ff334307defa22f3d9bd0933d7665ee04ca7b89f0a69b2d.pyd" "C:\Users\USERNAME\AppData\Local\PyTensor\compiledir_Windows-11-10.0.22631-SP0-Intel64_Family_6_Model_186_Stepping_2_GenuineIntel-3.13.3-64\tmp_oc9l42q\mod.cpp" -lopenblas -lgfortran "..\python\python313.dll"

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "c:\Users\USERNAME\Repos\TCM\calibrations\TCM4\check_blas.py", line 13, in <module>
exec(open(file).read())
~~~~^^^^^^^^^^^^^^^^^^^
File "<string>", line 274, in <module>
File "<string>", line 57, in execute
File "..\python\Lib\site-packages\pytensor\compile\function\__init__.py", line 332, in function
fn = pfunc(
params=inputs,
...<12 lines>...
trust_input=trust_input,
)
File "..\python\Lib\site-packages\pytensor\compile\function\pfunc.py", line 466, in pfunc
return orig_function(
inputs,
...<8 lines>...
trust_input=trust_input,
)
File "..\python\Lib\site-packages\pytensor\compile\function\types.py", line 1833, in orig_function
fn = m.create(defaults)
File "..\python\Lib\site-packages\pytensor\compile\function\types.py", line 1717, in create
_fn, _i, _o = self.linker.make_thunk(
~~~~~~~~~~~~~~~~~~~~~~^
input_storage=input_storage_lists, storage_map=storage_map
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "..\python\Lib\site-packages\pytensor\link\basic.py", line 245, in make_thunk
return self.make_all(
~~~~~~~~~~~~~^
input_storage=input_storage,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
output_storage=output_storage,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
storage_map=storage_map,
^^^^^^^^^^^^^^^^^^^^^^^^
)[:3]
^
File "..\python\Lib\site-packages\pytensor\link\vm.py", line 1239, in make_all
raise_with_op(fgraph, node)
~~~~~~~~~~~~~^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\link\utils.py", line 526, in raise_with_op
raise exc_value.with_traceback(exc_trace)
File "..\python\Lib\site-packages\pytensor\link\vm.py", line 1230, in make_all
node.op.make_thunk(node, storage_map, compute_map, [], impl=impl)
~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\link\c\op.py", line 119, in make_thunk
return self.make_c_thunk(node, storage_map, compute_map, no_recycling)
~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "..\python\Lib\site-packages\pytensor\link\c\op.py", line 84, in make_c_thunk
outputs = cl.make_thunk(
input_storage=node_input_storage, output_storage=node_output_storage
)
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1185, in make_thunk
cthunk, module, in_storage, out_storage, error_storage = self.compile(
~~~~~~~~~~~~~~~~^
input_storage, output_storage, storage_map, cache
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1102, in compile
thunk, module = self.cthunk_factory(
~~~~~~~~~~~~~~~~~~~^
error_storage,
^^^^^^^^^^^^^^
...<3 lines>...
cache,
^^^^^^
)
^
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1626, in cthunk_factory
module = cache.module_from_key(key=key, lnk=self)
File "..\python\Lib\site-packages\pytensor\link\c\cmodule.py", line 1250, in module_from_key
module = lnk.compile_cmodule(location)
File "..\python\Lib\site-packages\pytensor\link\c\basic.py", line 1527, in compile_cmodule
module = c_compiler.compile_str(
module_name=mod.code_hash,
...<5 lines>...
preargs=preargs,
)
File "..\python\Lib\site-packages\pytensor\link\c\cmodule.py", line 2676, in compile_str
raise CompileError(
f"Compilation failed (return status={status}):\n{' '.join(cmd)}\n{compile_stderr}"
)
pytensor.link.c.exceptions.CompileError: Compilation failed (return status=1):
"..\msys64\mingw64\bin\g++.EXE" -shared -g -O3 -fno-math-errno -Wno-unused-label -Wno-unused-variable -Wno-write-strings -Wno-c++11-narrowing -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -m64 -DMS_WIN64 -I"..\python\Lib\site-packages\numpy_core\include" -I"..\python\include" -I"..\python\Lib\site-packages\pytensor\link\c\c_code" -L"..\python\libs" -L"..\python" -o "C:\Users\USERNAME\AppData\Local\PyTensor\compiledir_Windows-11-10.0.22631-SP0-Intel64_Family_6_Model_186_Stepping_2_GenuineIntel-3.13.3-64\tmp_oc9l42q\me7fa920728e5d2389ff334307defa22f3d9bd0933d7665ee04ca7b89f0a69b2d.pyd" "C:\Users\USERNAME\AppData\Local\PyTensor\compiledir_Windows-11-10.0.22631-SP0-Intel64_Family_6_Model_186_Stepping_2_GenuineIntel-3.13.3-64\tmp_oc9l42q\mod.cpp" -lopenblas -lgfortran "..\python\python313.dll"

Apply node that caused the error: Gemm{inplace}(<Matrix(float64, shape=(?, ?))>, 0.8, <Matrix(float64, shape=(?, ?))>, <Matrix(float64, shape=(?, ?))>, 0.4)
Toposort index: 0
Inputs types: [TensorType(float64, shape=(None, None)), TensorType(float64, shape=()), TensorType(float64, shape=(None, None)), TensorType(float64, shape=(None, None)), TensorType(float64, shape=())]

@maresb
Contributor

maresb commented May 16, 2025

Thanks @mvds314 for the additional info!

The output shows essentially the same error occurring over and over. If you're lucky, you may be able to paste one of those long lines from your output (it should look like the following) into a terminal, and hopefully it executes and shows a more informative error message:

"..\msys64\mingw64\bin\g++.EXE" -shared -g -O3 -fno-math-errno -Wno-unused-label -Wno-unused-variable -Wno-write-strings -Wno-c++11-narrowing -fno-exceptions -fno-unwind-tables -fno-asynchronous-unwind-tables -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -m64 -DMS_WIN64 -I"..\python\Lib\site-packages\numpy_core\include" -I"..\python\include" -I"..\python\Lib\site-packages\pytensor\link\c\c_code" -L"..\python\libs" -L"..\python" -o "C:\Users\USERNAME\AppData\Local\PyTensor\compiledir_Windows-11-10.0.22631-SP0-Intel64_Family_6_Model_186_Stepping_2_GenuineIntel-3.13.3-64\tmp_oc9l42q\me7fa920728e5d2389ff334307defa22f3d9bd0933d7665ee04ca7b89f0a69b2d.pyd" "C:\Users\USERNAME\AppData\Local\PyTensor\compiledir_Windows-11-10.0.22631-SP0-Intel64_Family_6_Model_186_Stepping_2_GenuineIntel-3.13.3-64\tmp_oc9l42q\mod.cpp" -lopenblas -lgfortran "..\python\python313.dll"

My expectation is that this is somehow a duplicate of #1398, but I'm surprised that you're not getting the same ImportError: DLL load failed while importing.
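
For anyone following along, rerunning the failing compiler command outside PyTensor can also be scripted. The following is only a sketch: `rerun_command` is a hypothetical helper, not part of PyTensor, and the demo uses the Python interpreter as a stand-in for the real `g++` line, which you would copy from your own `CompileError` message.

```python
import subprocess
import sys


def rerun_command(cmd):
    """Run a command and return (return code, stdout, stderr) as text."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stdout, result.stderr


# Stand-in command that fails the way a compiler would (non-zero exit, message on stderr).
# Replace `demo_cmd` with the g++ command taken from the CompileError message.
demo_cmd = [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(1)"]
status, out, err = rerun_command(demo_cmd)
print(f"return status={status}")
print(f"stderr: {err}")
```

Capturing stderr separately keeps the compiler's own diagnostics apart from the long PyTensor traceback.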

@mvds314
Author

mvds314 commented May 16, 2025

When I do this, I get a popup window with

The procedure entry point clock_gettime64 could not be located in the dynamic link library

C:\....\msys64\mingw64\bin\...\lib\gcc\x86_64-w64-mingw32/15.1.0\cc1plus.exe

@maresb
Contributor

maresb commented May 17, 2025

Thanks a lot @mvds314 for the extra info, that is indeed extremely helpful. Based on that I'm proposing a possible fix in #1403.

@maresb
Contributor

maresb commented May 19, 2025

Reopening because I think we'd also need to merge #1400 to fix this.

@maresb maresb reopened this May 19, 2025
@ricardoV94
Member

Reopening because I think we'd also need to merge #1400 to fix this.

It's merged

@maresb
Contributor

maresb commented May 20, 2025

@mvds314 could you please try again with the version we just released?

@mvds314
Author

mvds314 commented May 28, 2025

I ran the example code again with gcc 15.1 and pytensor 2.31.2. Unfortunately, the issue persists.

@maresb
Contributor

maresb commented May 28, 2025

I ran the example code again with gcc 15.1 and pytensor 2.31.2. Unfortunately, the issue persists.

Thanks for getting back to me @mvds314! It's disappointing that the issue persists.

Would you be able to repeat the procedure from #1401 (comment) and see if the error message there is the same or different? Thanks!

@mvds314
Author

mvds314 commented May 29, 2025

Sometimes I get the message in a popup window without having to run the g++ command separately.
It appears right away when I run the script. It's almost the same as the other one.

The procedure entry point clock_gettime64 could not be located in the dynamic link library

C:\....\msys64\mingw64\bin\...\lib\gcc\x86_64-w64-mingw32\15.1.0\cc1.exe

Then, after a couple of minutes, just before I get a huge error in the terminal, I get another popup, which is the same as before:

The procedure entry point clock_gettime64 could not be located in the dynamic link library

C:\....\msys64\mingw64\bin\...\lib\gcc\x86_64-w64-mingw64\15.1.0\cc1plus.exe

Note the differences between the two messages: mingw32 vs mingw64 and cc1 vs cc1plus.
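
For context, a "procedure entry point could not be located" popup is the Windows loader reporting that an executable imports a function that the DLL it actually loaded does not export. One possible cause (an assumption here, not something confirmed in this thread) is that files from more than one toolchain are reachable through PATH. The sketch below lists every copy of a file name found on PATH, in search order; `find_on_path` is a hypothetical helper written for illustration.

```python
import os
from pathlib import Path


def find_on_path(filename, path_value=None):
    """Return each PATH entry's copy of `filename`, in search order."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for entry in path_value.split(os.pathsep):
        if not entry:
            continue  # skip empty PATH segments
        candidate = Path(entry) / filename
        if candidate.is_file():
            hits.append(candidate)
    return hits


# The first hit is the one a plain `g++` invocation would normally resolve to.
for position, hit in enumerate(find_on_path("g++.exe")):
    print(position, hit)
```

The same helper can be pointed at a DLL name instead of `g++.exe` to see which copy comes first.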

@ricardoV94
Member

Try clearing the pytensor cache, just in case that's the problem. Run pytensor-cache purge in a terminal where the relevant Python environment is activated.

@mvds314
Author

mvds314 commented May 30, 2025

Unfortunately, purging the cache does not help. The error remains the same.

@ricardoV94
Member

ricardoV94 commented May 30, 2025

Just to double check, you get a core dump running the example in the first message, in a fresh new conda environment with pytensor==2.31.2 installed from the conda-forge channel?

Are you using regular python interpreter or ipython/jupyter?

Can you print pytensor.__version__, pytensor.__path__?
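
If it helps, the details asked for above can be gathered in one go. This is a rough sketch rather than an official diagnostic: `describe_package` is a hypothetical helper, the IPython check is only a hint, and the `g++` found on PATH is not necessarily the compiler PyTensor is configured to use.

```python
import importlib.metadata
import importlib.util
import shutil
import sys


def describe_package(name):
    """Report the installed version and import location of a top-level package."""
    try:
        version = importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        version = None  # no installed distribution with that name
    spec = importlib.util.find_spec(name)
    paths = list(spec.submodule_search_locations or []) if spec else []
    return {"name": name, "version": version, "paths": paths}


report = describe_package("pytensor")
report["interpreter"] = sys.executable
report["ipython_loaded"] = "IPython" in sys.modules  # rough hint only
report["first_gxx_on_path"] = shutil.which("g++")  # may differ from what PyTensor uses
for key, value in report.items():
    print(f"{key}: {value}")
```

Looking up the package without importing it means the report still works when the import itself crashes.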

@mvds314
Author

mvds314 commented May 30, 2025

When I install pytensor with conda, it works, probably because conda uses gcc 13.3. Problems arise with gcc 15 and higher.

I am using winpython with Python version 3.13 and installed pytensor with pip.

pytensor.__version__ gives 2.31.2, and pytensor.__path__ gives C:\..\WPy64-31330\envs\MYENV\Lib\site-packages\pytensor
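
To make the compiler version explicit in each environment, something like the following could be run. It is a sketch under assumptions: `parse_major` and `gxx_major` are hypothetical helpers, the check only sees the first `g++` on PATH, and the threshold of 15 simply mirrors what is reported in this thread.

```python
import shutil
import subprocess


def parse_major(version_text):
    """Extract the major version from a string such as '15.1.0'."""
    return int(version_text.strip().split(".")[0])


def gxx_major(executable="g++"):
    """Ask a g++ executable for its version; return None if that is not possible."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "-dumpfullversion", "-dumpversion"],
        capture_output=True,
        text=True,
    )
    try:
        return parse_major(result.stdout)
    except ValueError:
        return None  # empty or unexpected output


major = gxx_major()
if major is None:
    print("no usable g++ found on PATH")
elif major >= 15:
    print(f"g++ {major}: the range where this issue was reported")
else:
    print(f"g++ {major}")
```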

@mvds314
Author

mvds314 commented May 30, 2025

Also, I checked whether I could find the symbol clock_gettime64 in any libgcc_s_seh-1.dll I could find on my system (there are quite a few, as they get shipped with all kinds of programs). For this I used: nm -g <path to dll>.

There are no matches (none in any of the gcc versions 13, 14, and 15 on my system), which is to be expected because, as I understand it, clock_gettime64 is Linux only.

My conclusion is that, somehow, pytensor starts using this symbol when it compiles stuff with gcc version 15 and above.
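
A cruder alternative to `nm -g` is to search the raw bytes of each file for the symbol name. The sketch below does that; `files_mentioning` is a hypothetical helper, and a byte match covers imported names as well as exported ones, so it is only a first filter and not proof that a DLL exports the function.

```python
import tempfile
from pathlib import Path


def files_mentioning(symbol, paths):
    """Return the paths whose raw bytes contain the ASCII symbol name."""
    needle = symbol.encode("ascii")
    matches = []
    for path in paths:
        try:
            data = Path(path).read_bytes()
        except OSError:
            continue  # unreadable or missing file: skip it
        if needle in data:
            matches.append(Path(path))
    return matches


# Self-contained demo with two fake "DLLs". For a real check, pass the DLLs of
# your own toolchain instead, e.g. the result of Path(some_bin_dir).glob("*.dll").
with tempfile.TemporaryDirectory() as tmp:
    with_symbol = Path(tmp) / "a.dll"
    without_symbol = Path(tmp) / "b.dll"
    with_symbol.write_bytes(b"\x00clock_gettime64\x00")
    without_symbol.write_bytes(b"\x00clock_gettime\x00")
    found = files_mentioning("clock_gettime64", [with_symbol, without_symbol])
    print([p.name for p in found])
```

Running it over every DLL in the toolchain's bin directory, rather than only libgcc_s_seh-1.dll, would show whether any library there mentions the symbol at all.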

@ricardoV94
Member

Not sure whether winpython introduces some sort of incompatibility, but it's odd you are seeing that error since we removed any mention of clock_gettime64 from the PyTensor codebase: #1403

Hence my suspicion about the cache. Can you try manually deleting the contents of pytensor.config.compiledir, in case pytensor-cache purge is not working correctly on your end?

Do you see any info on the error message as to what file/line clock_gettime64 is coming from?
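
For the manual cleanup, a minimal sketch could look like this. `empty_directory` is a hypothetical helper, and the demo runs on a throwaway directory; for the real thing one would pass `pytensor.config.compiledir`, which requires PyTensor to be importable.

```python
import shutil
import tempfile
from pathlib import Path


def empty_directory(directory):
    """Delete everything inside `directory` but keep the directory itself."""
    removed = 0
    for child in Path(directory).iterdir():
        if child.is_dir() and not child.is_symlink():
            shutil.rmtree(child)  # compiled-module subfolder
        else:
            child.unlink()  # plain file or symlink
        removed += 1
    return removed


# Demo on a temporary directory that imitates a compiledir layout.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "tmp_module").mkdir()
    (Path(tmp) / "tmp_module" / "mod.cpp").write_text("// stub")
    (Path(tmp) / "stale.lock").write_text("")
    removed = empty_directory(tmp)
    leftover = list(Path(tmp).iterdir())
    print(removed, leftover)
```

Keeping the directory itself avoids any question about whether PyTensor recreates it with the right permissions.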

@mvds314
Author

mvds314 commented May 30, 2025

I manually cleaned the folder, but the problem remains.
I don't think it is caused by explicit mentions of clock_gettime64, as that would mean gcc 14 would fail as well. My guess is that, somehow, gcc 15 decides that clock_gettime64 is required implicitly because of other functionality used.

Anyway, within the compiledir, there is a mod.cpp with the following contents. Is this of any use?

#include "pytensor_mod_helper.h"
#include "structmember.h"
#include <Python.h>
#include <sys/time.h>

#if PY_VERSION_HEX >= 0x03000000
#include "numpy/npy_3kcompat.h"
#endif

#ifndef Py_TYPE
#define Py_TYPE(obj) obj->ob_type
#endif

/**

TODO:
- Check max supported depth of recursion
- CLazyLinker should add context information to errors caught during evaluation.
Say what node we were on, add the traceback attached to the node.
- Clear containers of fully-used intermediate results if allow_gc is 1
- Add timers for profiling
- Add support for profiling space used.


  */
static double pytime(const struct timeval *tv) {
  struct timeval t;
  if (!tv) {
    tv = &t;
    gettimeofday(&t, NULL);
  }
  return (double)tv->tv_sec + (double)tv->tv_usec / 1000000.0;
}

/**
  Helper routine to convert a PyList of integers to a c array of integers.
  */
static int unpack_list_of_ssize_t(PyObject *pylist, Py_ssize_t **dst,
                                  Py_ssize_t *len, const char *kwname) {
  Py_ssize_t buflen, *buf;
  if (!PyList_Check(pylist)) {
    PyErr_Format(PyExc_TypeError, "%s must be list", kwname);
    return -1;
  }
  assert(NULL == *dst);
  *len = buflen = PyList_Size(pylist);
  *dst = buf = (Py_ssize_t *)calloc(buflen, sizeof(Py_ssize_t));
  assert(buf);
  for (int ii = 0; ii < buflen; ++ii) {
    PyObject *el_i = PyList_GetItem(pylist, ii);
    Py_ssize_t n_i = PyNumber_AsSsize_t(el_i, PyExc_IndexError);
    if (PyErr_Occurred()) {
      free(buf);
      *dst = NULL;
      return -1;
    }
    buf[ii] = n_i;
  }
  return 0;
}

static int unpack_nested_tuples_of_ssize_t(PyObject *pytuple, Py_ssize_t **dst,
                                           Py_ssize_t *len,
                                           const char *kwname) {
  Py_ssize_t buflen, *buf;
  if (!PyTuple_Check(pytuple)) {
    PyErr_Format(PyExc_TypeError, "%s must be a tuple of tuples", kwname);
    return -1;
  }
  *len = buflen = PyTuple_Size(pytuple);
  *dst = buf = (Py_ssize_t *) malloc(sizeof(Py_ssize_t *) * buflen * 2);
  assert(buf);
  for (int ii = 0; ii < buflen; ++ii) {
    PyObject *el_i = PyTuple_GetItem(pytuple, ii);

    PyObject *el_i_1 = PyTuple_GetItem(el_i, 0);
    PyObject *el_i_2 = PyTuple_GetItem(el_i, 1);

    Py_ssize_t n_1 = PyNumber_AsSsize_t(el_i_1, PyExc_IndexError);
    Py_ssize_t n_2 = PyNumber_AsSsize_t(el_i_2, PyExc_IndexError);

    if (PyErr_Occurred()) {
      free(buf);
      *dst = NULL;
      return -1;
    }

    buf[ii * 2 + 0] = n_1;
    buf[ii * 2 + 1] = n_2;
  }
  return 0;
}

/**

  CLazyLinker


  */
typedef struct {
  PyObject_HEAD
      /* Type-specific fields go here. */
      PyObject *nodes;      // the python list of nodes
  PyObject *thunks;         // python list of thunks
  PyObject *pre_call_clear; // list of cells to clear on call.
  int allow_gc;
  Py_ssize_t n_applies;
  int n_vars;        // number of variables in the graph
  int *var_computed; // 1 or 0 for every variable
  PyObject **var_computed_cells;
  PyObject **var_value_cells;
  Py_ssize_t **dependencies; // list of vars dependencies for GC
  Py_ssize_t *n_dependencies;

  Py_ssize_t n_output_vars;
  Py_ssize_t *output_vars; // variables that *must* be evaluated by call

  int *is_lazy; // 1 or 0 for every thunk

  Py_ssize_t *var_owner; // nodes[[var_owner[var_idx]]] is var[var_idx]->owner
  int *var_has_owner;    //  1 or 0

  Py_ssize_t *node_n_inputs;
  Py_ssize_t *node_n_outputs;
  Py_ssize_t **node_inputs;
  Py_ssize_t **node_outputs;
  Py_ssize_t
      *node_inputs_outputs_base; // node_inputs and node_outputs point into this
  Py_ssize_t *node_n_prereqs;
  Py_ssize_t **node_prereqs;

  Py_ssize_t *update_storage; // (input_idx, output_idx) pairs specifying
                              // output-to-input updates
  Py_ssize_t n_updates;

  void **thunk_cptr_fn;
  void **thunk_cptr_data;
  PyObject *call_times;
  PyObject *call_counts;
  int do_timing;
  int need_update_inputs;
  int position_of_error; // -1 for no error, otw the index into `thunks` that
                         // failed.
} CLazyLinker;

static void CLazyLinker_dealloc(PyObject *_self) {
  CLazyLinker *self = (CLazyLinker *)_self;
  free(self->thunk_cptr_fn);
  free(self->thunk_cptr_data);

  free(self->is_lazy);

  free(self->update_storage);

  if (self->node_n_prereqs) {
    for (int i = 0; i < self->n_applies; ++i) {
      free(self->node_prereqs[i]);
    }
  }
  free(self->node_n_prereqs);
  free(self->node_prereqs);
  free(self->node_inputs_outputs_base);
  free(self->node_n_inputs);
  free(self->node_n_outputs);
  free(self->node_inputs);
  free(self->node_outputs);

  if (self->dependencies) {
    for (int i = 0; i < self->n_vars; ++i) {
      free(self->dependencies[i]);
    }
    free(self->dependencies);
    free(self->n_dependencies);
  }

  free(self->var_owner);
  free(self->var_has_owner);
  free(self->var_computed);
  if (self->var_computed_cells) {
    for (int i = 0; i < self->n_vars; ++i) {
      Py_DECREF(self->var_computed_cells[i]);
      Py_DECREF(self->var_value_cells[i]);
    }
  }
  free(self->var_computed_cells);
  free(self->var_value_cells);
  free(self->output_vars);

  Py_XDECREF(self->nodes);
  Py_XDECREF(self->thunks);
  Py_XDECREF(self->call_times);
  Py_XDECREF(self->call_counts);
  Py_XDECREF(self->pre_call_clear);
  Py_TYPE(self)->tp_free((PyObject *)self);
}
static PyObject *CLazyLinker_new(PyTypeObject *type, PyObject *args,
                                 PyObject *kwds) {
  CLazyLinker *self;

  self = (CLazyLinker *)type->tp_alloc(type, 0);
  if (self != NULL) {
    self->nodes = NULL;
    self->thunks = NULL;
    self->pre_call_clear = NULL;

    self->allow_gc = 1;
    self->n_applies = 0;
    self->n_vars = 0;
    self->var_computed = NULL;
    self->var_computed_cells = NULL;
    self->var_value_cells = NULL;
    self->dependencies = NULL;
    self->n_dependencies = NULL;

    self->n_output_vars = 0;
    self->output_vars = NULL;

    self->is_lazy = NULL;

    self->var_owner = NULL;
    self->var_has_owner = NULL;

    self->node_n_inputs = NULL;
    self->node_n_outputs = NULL;
    self->node_inputs = NULL;
    self->node_outputs = NULL;
    self->node_inputs_outputs_base = NULL;
    self->node_prereqs = NULL;
    self->node_n_prereqs = NULL;

    self->update_storage = NULL;
    self->n_updates = 0;

    self->thunk_cptr_data = NULL;
    self->thunk_cptr_fn = NULL;
    self->call_times = NULL;
    self->call_counts = NULL;
    self->do_timing = 0;

    self->need_update_inputs = 0;
    self->position_of_error = -1;
  }
  return (PyObject *)self;
}

static int CLazyLinker_init(CLazyLinker *self, PyObject *args, PyObject *kwds) {
  static char *kwlist[] = {(char *)"nodes",
                           (char *)"thunks",
                           (char *)"pre_call_clear",
                           (char *)"allow_gc",
                           (char *)"call_counts",
                           (char *)"call_times",
                           (char *)"compute_map_list",
                           (char *)"storage_map_list",
                           (char *)"base_input_output_list",
                           (char *)"node_n_inputs",
                           (char *)"node_n_outputs",
                           (char *)"node_input_offset",
                           (char *)"node_output_offset",
                           (char *)"var_owner",
                           (char *)"is_lazy_list",
                           (char *)"output_vars",
                           (char *)"node_prereqs",
                           (char *)"node_output_size",
                           (char *)"update_storage",
                           (char *)"dependencies",
                           NULL};

  PyObject *compute_map_list = NULL, *storage_map_list = NULL,
           *base_input_output_list = NULL, *node_n_inputs = NULL,
           *node_n_outputs = NULL, *node_input_offset = NULL,
           *node_output_offset = NULL, *var_owner = NULL, *is_lazy = NULL,
           *output_vars = NULL, *node_prereqs = NULL, *node_output_size = NULL,
    *dependencies = NULL, *update_storage=NULL;

  assert(!self->nodes);
  if (!PyArg_ParseTupleAndKeywords(
          args, kwds, "OOOiOOOOOOOOOOOOOOOO", kwlist, &self->nodes,
          &self->thunks, &self->pre_call_clear, &self->allow_gc,
          &self->call_counts, &self->call_times, &compute_map_list,
          &storage_map_list, &base_input_output_list, &node_n_inputs,
          &node_n_outputs, &node_input_offset, &node_output_offset, &var_owner,
          &is_lazy, &output_vars, &node_prereqs, &node_output_size,
          &update_storage, &dependencies))
    return -1;
  Py_INCREF(self->nodes);
  Py_INCREF(self->thunks);
  Py_INCREF(self->pre_call_clear);
  Py_INCREF(self->call_counts);
  Py_INCREF(self->call_times);

  Py_ssize_t n_applies = PyList_Size(self->nodes);

  self->n_applies = n_applies;
  self->n_vars = PyList_Size(var_owner);

  if (PyList_Size(self->thunks) != n_applies)
    return -1;
  if (PyList_Size(self->call_counts) != n_applies)
    return -1;
  if (PyList_Size(self->call_times) != n_applies)
    return -1;

  // allocated and initialize thunk_cptr_data and thunk_cptr_fn
  if (n_applies) {
    self->thunk_cptr_data = (void **)calloc(n_applies, sizeof(void *));
    self->thunk_cptr_fn = (void **)calloc(n_applies, sizeof(void *));
    self->is_lazy = (int *)calloc(n_applies, sizeof(int));
    self->node_prereqs = (Py_ssize_t **)calloc(n_applies, sizeof(Py_ssize_t *));
    self->node_n_prereqs = (Py_ssize_t *)calloc(n_applies, sizeof(Py_ssize_t));
    assert(self->node_prereqs);
    assert(self->node_n_prereqs);
    assert(self->is_lazy);
    assert(self->thunk_cptr_fn);
    assert(self->thunk_cptr_data);

    for (int i = 0; i < n_applies; ++i) {
      PyObject *thunk = PyList_GetItem(self->thunks, i);
      // thunk is borrowed
      if (PyObject_HasAttrString(thunk, "cthunk")) {
        PyObject *cthunk = PyObject_GetAttrString(thunk, "cthunk");
        // new reference
        assert(cthunk && NpyCapsule_Check(cthunk));
        self->thunk_cptr_fn[i] = NpyCapsule_AsVoidPtr(cthunk);
        self->thunk_cptr_data[i] = NpyCapsule_GetDesc(cthunk);
        Py_DECREF(cthunk);
        // cthunk is kept alive by membership in self->thunks
      }

      PyObject *el_i = PyList_GetItem(is_lazy, i);
      self->is_lazy[i] = PyNumber_AsSsize_t(el_i, NULL);

      /* now get the prereqs */
      el_i = PyList_GetItem(node_prereqs, i);
      assert(PyList_Check(el_i));
      self->node_n_prereqs[i] = PyList_Size(el_i);
      if (self->node_n_prereqs[i]) {
        self->node_prereqs[i] =
            (Py_ssize_t *)malloc(PyList_Size(el_i) * sizeof(Py_ssize_t));
        for (int j = 0; j < PyList_Size(el_i); ++j) {
          PyObject *el_ij = PyList_GetItem(el_i, j);
          Py_ssize_t N = PyNumber_AsSsize_t(el_ij, PyExc_IndexError);
          if (PyErr_Occurred())
            return -1;
          // N < n. variables
          assert(N < PyList_Size(var_owner));
          self->node_prereqs[i][j] = N;
        }
      }
    }
  }
  if (PyList_Check(base_input_output_list)) {
    Py_ssize_t n_inputs_outputs_base = PyList_Size(base_input_output_list);
    self->node_inputs_outputs_base =
        (Py_ssize_t *)calloc(n_inputs_outputs_base, sizeof(Py_ssize_t));
    assert(self->node_inputs_outputs_base);
    for (int i = 0; i < n_inputs_outputs_base; ++i) {
      PyObject *el_i = PyList_GetItem(base_input_output_list, i);
      Py_ssize_t idx = PyNumber_AsSsize_t(el_i, PyExc_IndexError);
      if (PyErr_Occurred())
        return -1;
      self->node_inputs_outputs_base[i] = idx;
    }
    self->node_n_inputs = (Py_ssize_t *)calloc(n_applies, sizeof(Py_ssize_t));
    assert(self->node_n_inputs);
    self->node_n_outputs = (Py_ssize_t *)calloc(n_applies, sizeof(Py_ssize_t));
    assert(self->node_n_outputs);
    self->node_inputs = (Py_ssize_t **)calloc(n_applies, sizeof(Py_ssize_t *));
    assert(self->node_inputs);
    self->node_outputs = (Py_ssize_t **)calloc(n_applies, sizeof(Py_ssize_t *));
    assert(self->node_outputs);
    for (int i = 0; i < n_applies; ++i) {
      Py_ssize_t N;
      N = PyNumber_AsSsize_t(PyList_GetItem(node_n_inputs, i),
                             PyExc_IndexError);
      if (PyErr_Occurred())
        return -1;
      assert(N <= n_inputs_outputs_base);
      self->node_n_inputs[i] = N;
      N = PyNumber_AsSsize_t(PyList_GetItem(node_n_outputs, i),
                             PyExc_IndexError);
      if (PyErr_Occurred())
        return -1;
      assert(N <= n_inputs_outputs_base);
      self->node_n_outputs[i] = N;
      N = PyNumber_AsSsize_t(PyList_GetItem(node_input_offset, i),
                             PyExc_IndexError);
      if (PyErr_Occurred())
        return -1;
      assert(N <= n_inputs_outputs_base);
      self->node_inputs[i] = &self->node_inputs_outputs_base[N];
      N = PyNumber_AsSsize_t(PyList_GetItem(node_output_offset, i),
                             PyExc_IndexError);
      if (PyErr_Occurred())
        return -1;
      assert(N <= n_inputs_outputs_base);
      self->node_outputs[i] = &self->node_inputs_outputs_base[N];
    }
  } else {
    PyErr_SetString(PyExc_TypeError, "base_input_output_list must be list");
    return -1;
  }

  // allocation for var_owner
  if (PyList_Check(var_owner)) {
    self->var_owner = (Py_ssize_t *)calloc(self->n_vars, sizeof(Py_ssize_t));
    self->var_has_owner = (int *)calloc(self->n_vars, sizeof(int));
    self->var_computed = (int *)calloc(self->n_vars, sizeof(int));
    self->var_computed_cells =
        (PyObject **)calloc(self->n_vars, sizeof(PyObject *));
    self->var_value_cells =
        (PyObject **)calloc(self->n_vars, sizeof(PyObject *));
    for (int i = 0; i < self->n_vars; ++i) {
      PyObject *el_i = PyList_GetItem(var_owner, i);
      if (el_i == Py_None) {
        self->var_has_owner[i] = 0;
      } else {
        Py_ssize_t N = PyNumber_AsSsize_t(el_i, PyExc_IndexError);
        if (PyErr_Occurred())
          return -1;
        assert(N <= n_applies);
        self->var_owner[i] = N;
        self->var_has_owner[i] = 1;
      }
      self->var_computed_cells[i] = PyList_GetItem(compute_map_list, i);
      Py_INCREF(self->var_computed_cells[i]);
      self->var_value_cells[i] = PyList_GetItem(storage_map_list, i);
      Py_INCREF(self->var_value_cells[i]);
    }
  } else {
    PyErr_SetString(PyExc_TypeError, "var_owner must be list");
    return -1;
  }

  if (dependencies != Py_None) {
    self->dependencies =
        (Py_ssize_t **)calloc(self->n_vars, sizeof(Py_ssize_t *));
    self->n_dependencies =
        (Py_ssize_t *)calloc(self->n_vars, sizeof(Py_ssize_t));
    assert(self->dependencies);
    assert(self->n_dependencies);

    for (int i = 0; i < self->n_vars; ++i) {
      PyObject *tmp = PyList_GetItem(dependencies, i);
      // refcounting - tmp is borrowed
      if (unpack_list_of_ssize_t(tmp, &self->dependencies[i],
                                 &self->n_dependencies[i], "dependencies"))
        return -1;
    }
  }

  if (unpack_list_of_ssize_t(output_vars, &self->output_vars,
                             &self->n_output_vars, "output_vars"))
    return -1;

  for (int i = 0; i < self->n_output_vars; ++i) {
    assert(self->output_vars[i] < self->n_vars);
  }

  if (unpack_nested_tuples_of_ssize_t(update_storage, &self->update_storage,
                                      &self->n_updates, "updates_storage"))
    return -1;

  return 0;
}
static void set_position_of_error(CLazyLinker *self, int owner_idx) {
  if (self->position_of_error == -1) {
    self->position_of_error = owner_idx;
  }
}
static PyObject *pycall(CLazyLinker *self, Py_ssize_t node_idx, int verbose) {
  // call thunk to see which inputs it wants
  PyObject *thunk = PyList_GetItem(self->thunks, node_idx);
  // refcounting - thunk is borrowed
  PyObject *rval = NULL;
  if (self->do_timing) {
    double t0 = pytime(NULL);
    if (verbose)
      fprintf(stderr, "calling via Python (node %i)\n", (int)node_idx);
    rval = PyObject_CallNoArgs(thunk);
    if (rval) {
      double t1 = pytime(NULL);
      double ti = PyFloat_AsDouble(PyList_GetItem(self->call_times, node_idx));
      PyList_SetItem(self->call_times, node_idx,
                     PyFloat_FromDouble(t1 - t0 + ti));
      PyObject *count = PyList_GetItem(self->call_counts, node_idx);
      long icount = PyLong_AsLong(count);
      PyList_SetItem(self->call_counts, node_idx, PyLong_FromLong(icount + 1));
    }
  } else {
    if (verbose) {
      fprintf(stderr, "calling via Python (node %i)\n", (int)node_idx);
    }
    rval = PyObject_CallNoArgs(thunk);
  }
  return rval;
}
static int c_call(CLazyLinker *self, Py_ssize_t node_idx, int verbose) {
  void *ptr_addr = self->thunk_cptr_fn[node_idx];
  int (*fn)(void *) = (int (*)(void *))(ptr_addr);
  if (verbose)
    fprintf(stderr, "calling non-lazy shortcut (node %i)\n", (int)node_idx);
  int err = 0;
  if (self->do_timing) {
    double t0 = pytime(NULL);
    err = fn(self->thunk_cptr_data[node_idx]);
    double t1 = pytime(NULL);
    double ti = PyFloat_AsDouble(PyList_GetItem(self->call_times, node_idx));
    PyList_SetItem(self->call_times, node_idx,
                   PyFloat_FromDouble(t1 - t0 + ti));
    PyObject *count = PyList_GetItem(self->call_counts, node_idx);
    long icount = PyLong_AsLong(count);
    PyList_SetItem(self->call_counts, node_idx, PyLong_FromLong(icount + 1));
  } else {
    err = fn(self->thunk_cptr_data[node_idx]);
  }

  if (err) {
    // cast the argument to a PyList (as described near line 226 of cc.py)
    PyObject *__ERROR = ((PyObject **)self->thunk_cptr_data[node_idx])[0];
    assert(PyList_Check(__ERROR));
    assert(PyList_Size(__ERROR) == 3);
    PyObject *err_type = PyList_GetItem(__ERROR, 0);  // stolen ref
    PyObject *err_msg = PyList_GetItem(__ERROR, 1);   // stolen ref
    PyObject *err_trace = PyList_GetItem(__ERROR, 2); // stolen ref
    PyList_SET_ITEM(__ERROR, 0, Py_None);
    Py_INCREF(Py_None); // clobbers old ref
    PyList_SET_ITEM(__ERROR, 1, Py_None);
    Py_INCREF(Py_None); // clobbers old ref
    PyList_SET_ITEM(__ERROR, 2, Py_None);
    Py_INCREF(Py_None); // clobbers old ref

    assert(!PyErr_Occurred()); // because CLinker hid the exception in __ERROR
                               // aka data
    PyErr_Restore(err_type, err_msg, err_trace); // steals refs to args
  }
  if (err)
    set_position_of_error(self, node_idx);
  return err;
}
static int lazy_rec_eval(CLazyLinker *self, Py_ssize_t var_idx, PyObject *one,
                         PyObject *zero) {
  PyObject *rval = NULL;
  int verbose = 0;
  int err = 0;

  if (verbose)
    fprintf(stderr, "lazy_rec computing %i\n", (int)var_idx);

  if (self->var_computed[var_idx] || !self->var_has_owner[var_idx])
    return 0;

  Py_ssize_t owner_idx = self->var_owner[var_idx];

  // STEP 1: compute the pre-requirements of the node
  // Includes input nodes for non-lazy ops.
  for (int i = 0; i < self->node_n_prereqs[owner_idx]; ++i) {
    Py_ssize_t prereq_idx = self->node_prereqs[owner_idx][i];
    if (!self->var_computed[prereq_idx]) {
      err = lazy_rec_eval(self, prereq_idx, one, zero);
      if (err)
        return err;
    }
    assert(self->var_computed[prereq_idx]);
  }

  // STEP 2: compute the node itself
  if (self->is_lazy[owner_idx]) {
    // update the compute_map cells corresponding to the inputs of this thunk
    for (int i = 0; i < self->node_n_inputs[owner_idx]; ++i) {
      int in_idx = self->node_inputs[owner_idx][i];
      if (self->var_computed[in_idx]) {
        Py_INCREF(one);
        err = PyList_SetItem(self->var_computed_cells[in_idx], 0, one);
      } else {
        Py_INCREF(zero);
        err = PyList_SetItem(self->var_computed_cells[in_idx], 0, zero);
      }
      if (err)
        goto fail;
    }

    rval = pycall(self, owner_idx, verbose);
    // refcounting - rval is new ref
    // TODO: to prevent infinite loops
    // - consider check that a thunk does not ask for an input that is already
    // computed
    if (rval == NULL) {
      assert(PyErr_Occurred());
      err = 1;
      goto fail;
    }

    // update the computed-ness of any output cells
    for (int i = 0; i < self->node_n_outputs[owner_idx]; ++i) {
      int out_idx = self->node_outputs[owner_idx][i];
      PyObject *el_i = PyList_GetItem(self->var_computed_cells[out_idx], 0);
      Py_ssize_t N = PyNumber_AsSsize_t(el_i, PyExc_IndexError);
      if (PyErr_Occurred()) {
        err = -1;
        goto pyfail;
      }
      assert(N == 0 || N == 1);
      self->var_computed[out_idx] = N;
    }
    if (!self->var_computed[var_idx]) {
      /*
       * If self is not computed after the call, this means that some
       * inputs are needed.  Compute the ones on the returned list
       * and try to compute the current node again (with recursive call).
       * This allows a node to request more nodes more than once before
       * finally yielding a result.
       */
      if (!PyList_Check(rval)) {
        // TODO: More helpful error to help find *which node* made this
        // bad thunk
        PyErr_SetString(PyExc_TypeError, "lazy thunk should return a list");
        err = 1;
        goto pyfail;
      }

      if (!PyList_Size(rval)) {
        PyErr_SetString(
            PyExc_ValueError,
            "lazy thunk returned empty list without computing output");
        err = 1;
        goto pyfail;
      }

      for (int i = 0; i < PyList_Size(rval); ++i) {
        PyObject *el_i = PyList_GetItem(rval, i);
        Py_ssize_t N = PyNumber_AsSsize_t(el_i, PyExc_IndexError);
        if (PyErr_Occurred()) {
          err = 1;
          goto pyfail;
        }
        assert(N <= self->node_n_inputs[owner_idx]);
        Py_ssize_t input_idx = self->node_inputs[owner_idx][N];
        err = lazy_rec_eval(self, input_idx, one, zero);
        if (err)
          goto pyfail;
      }

      Py_DECREF(rval);
      /*
       * We intentionally skip all the end-of-function processing
       * (mark outputs, GC) as it will be performed by the call
       * that actually manages to compute the result.
       */
      return lazy_rec_eval(self, var_idx, one, zero);
    }

    Py_DECREF(rval);
  } else // owner is not a lazy op. Ensure all inputs are evaluated.
  {
    // loop over inputs to owner
    // call lazy_rec_eval on each one that is not computed.
    // if there's an error, pass it up the stack
    for (int i = 0; i < self->node_n_inputs[owner_idx]; ++i) {
      Py_ssize_t input_idx = self->node_inputs[owner_idx][i];
      if (!self->var_computed[input_idx]) {
        err = lazy_rec_eval(self, input_idx, one, zero);
        if (err)
          return err;
      }
      assert(self->var_computed[input_idx]);
    }

    // call the thunk for this owner.
    if (self->thunk_cptr_fn[owner_idx]) {
      err = c_call(self, owner_idx, verbose);
      if (err)
        goto fail;
    } else {
      rval = pycall(self, owner_idx, verbose);
      // rval is new ref
      if (rval) // pycall returned normally (no exception)
      {
        if (rval == Py_None) {
          Py_DECREF(rval); // ignore a return of None
        } else if (PyList_Check(rval)) {
          PyErr_SetString(PyExc_TypeError,
                          "non-lazy thunk should return None, not list");
          err = 1;
          goto pyfail;
        } else // don't know what it returned, but it wasn't right.
        {
          PyErr_SetObject(PyExc_TypeError, rval);
          err = 1;
          // We don't release rval since we put it in the error above
          goto fail;
        }
      } else // pycall returned NULL (internal error)
      {
        err = 1;
        goto fail;
      }
    }
  }

  // loop over all outputs and mark them as computed
  for (int i = 0; i < self->node_n_outputs[owner_idx]; ++i) {
    self->var_computed[self->node_outputs[owner_idx][i]] = 1;
  }

  // Free vars that are not needed anymore
  if (self->allow_gc) {
    for (int i = 0; i < self->node_n_inputs[owner_idx]; ++i) {
      int cleanup = 1;
      Py_ssize_t i_idx = self->node_inputs[owner_idx][i];
      if (!self->var_has_owner[i_idx])
        continue;

      for (int j = 0; j < self->n_output_vars; ++j) {
        if (i_idx == self->output_vars[j]) {
          cleanup = 0;
          break;
        }
      }
      if (!cleanup)
        continue;

      for (int j = 0; j < self->n_dependencies[i_idx]; ++j) {
        if (!self->var_computed[self->dependencies[i_idx][j]]) {
          cleanup = 0;
          break;
        }
      }
      if (!cleanup)
        continue;

      Py_INCREF(Py_None);
      err = PyList_SetItem(self->var_value_cells[i_idx], 0, Py_None);
      // See the Stack gc implementation for why we change it to 2 and not 0.
      self->var_computed[i_idx] = 2;
      if (err)
        goto fail;
    }
  }

  return 0;
pyfail:
  Py_DECREF(rval);
fail:
  set_position_of_error(self, owner_idx);
  return err;
}

static PyObject *CLazyLinker_call(PyObject *_self, PyObject *args,
                                  PyObject *kwds) {
  CLazyLinker *self = (CLazyLinker *)_self;
  static char *kwlist[] = {(char *)"time_thunks", (char *)"n_calls",
                           (char *)"output_subset", NULL};
  int n_calls = 1;
  PyObject *output_subset_ptr = NULL;
  if (!PyArg_ParseTupleAndKeywords(args, kwds, "|iiO", kwlist, &self->do_timing,
                                   &n_calls, &output_subset_ptr))
    return NULL;

  int err = 0;
  // parse an output_subset list
  // it is stored as a bool list of length n_output_vars: calculate a var or not
  char *output_subset = NULL;
  int output_subset_size = -1;
  if (output_subset_ptr != NULL) {
    if (!PyList_Check(output_subset_ptr)) {
      err = 1;
      PyErr_SetString(PyExc_RuntimeError, "Output_subset is not a list");
    } else {
      output_subset_size = PyList_Size(output_subset_ptr);
      output_subset = (char *)calloc(self->n_output_vars, sizeof(char));
      for (int it = 0; it < output_subset_size; ++it) {
        PyObject *elem = PyList_GetItem(output_subset_ptr, it);
        if (!PyLong_Check(elem)) {
          err = 1;
          PyErr_SetString(PyExc_RuntimeError,
                          "Some elements of output_subset list are not int");
        }
        output_subset[PyLong_AsLong(elem)] = 1;
      }
    }
  }

  self->position_of_error = -1;
  // create constants used to fill the var_compute_cells
  PyObject *one = PyLong_FromLong(1);
  PyObject *zero = PyLong_FromLong(0);

  // pre-allocate our return value
  Py_INCREF(Py_None);
  PyObject *rval = Py_None;
  // clear storage of pre_call_clear elements
  for (int call_i = 0; call_i < n_calls && (!err); ++call_i) {
    Py_ssize_t n_pre_call_clear = PyList_Size(self->pre_call_clear);
    assert(PyList_Check(self->pre_call_clear));
    for (int i = 0; i < n_pre_call_clear; ++i) {
      PyObject *el_i = PyList_GetItem(self->pre_call_clear, i);
      Py_INCREF(Py_None);
      PyList_SetItem(el_i, 0, Py_None);
    }
    // clear the computed flag out of all non-input vars
    for (int i = 0; i < self->n_vars; ++i) {
      self->var_computed[i] = !self->var_has_owner[i];
      if (self->var_computed[i]) {
        Py_INCREF(one);
        PyList_SetItem(self->var_computed_cells[i], 0, one);
      } else {
        Py_INCREF(zero);
        PyList_SetItem(self->var_computed_cells[i], 0, zero);
      }
    }

    int first_updated = self->n_output_vars - self->n_updates;
    for (int i = 0; i < self->n_output_vars && (!err); ++i) {
      if (i >= first_updated || output_subset == NULL ||
          output_subset[i] == 1) {
        err = lazy_rec_eval(self, self->output_vars[i], one, zero);
      }
    }

    if (!err) {
      // save references to outputs prior to updating storage containers
      assert(self->n_output_vars >= self->n_updates);
      Py_DECREF(rval);
      rval = PyList_New(self->n_output_vars);
      for (int i = 0; i < (self->n_output_vars); ++i) {
        Py_ssize_t src = self->output_vars[i];
        PyObject *item = PyList_GetItem(self->var_value_cells[src], 0);
        if ((output_subset == NULL || output_subset[i]) &&
            self->var_computed[src] != 1) {
          err = 1;
          PyErr_Format(PyExc_AssertionError,
                       "The compute map of output %d should contain "
                       "1 at the end of execution, not %d.",
                       i, self->var_computed[src]);
          break;
        }
        Py_INCREF(item);
        PyList_SetItem(rval, i, item);
      }
    }

    if (!err) {
      // Update the inputs that have an update rule
      for (int i = 0; i < self->n_updates; ++i) {
        Py_ssize_t in_idx = self->update_storage[i * 2 + 0];
        Py_ssize_t out_idx = self->update_storage[i * 2 + 1];
        PyObject *tmp = PyList_GetItem(rval, out_idx);
        Py_INCREF(tmp);
        PyList_SetItem(self->var_value_cells[in_idx], 0, tmp);
      }
    }
  }

  /*
    Clear everything that is left and not an output. This is needed
    for lazy evaluation since the current GC algo is too conservative
    with lazy graphs.
  */
  if (self->allow_gc && !err) {
    for (Py_ssize_t i = 0; i < self->n_vars; ++i) {
      int do_cleanup = 1;
      if (!self->var_has_owner[i] || !self->var_computed[i])
        continue;
      for (int j = 0; j < self->n_output_vars; ++j) {
        if (i == self->output_vars[j]) {
          do_cleanup = 0;
          break;
        }
      }
      if (!do_cleanup)
        continue;
      Py_INCREF(Py_None);
      PyList_SetItem(self->var_value_cells[i], 0, Py_None);
    }
  }
  if (output_subset != NULL)
    free(output_subset);

  Py_DECREF(one);
  Py_DECREF(zero);
  if (err) {
    Py_DECREF(rval);
    return NULL;
  }
  return rval;
}

#if 0
static PyMethodDef CLazyLinker_methods[] = {
    {
      //"name", (PyCFunction)CLazyLinker_accept, METH_VARARGS, "Return the name, combining the first and last name"
    },
    {NULL}  /* Sentinel */
};
#endif

static PyObject *CLazyLinker_get_allow_gc(CLazyLinker *self, void *closure) {
  return PyBool_FromLong(self->allow_gc);
}

static int CLazyLinker_set_allow_gc(CLazyLinker *self, PyObject *value,
                                    void *closure) {
  if (!PyBool_Check(value))
    return -1;

  if (value == Py_True)
    self->allow_gc = true;
  else
    self->allow_gc = false;
  return 0;
}

static PyGetSetDef CLazyLinker_getset[] = {
    {(char *)"allow_gc", (getter)CLazyLinker_get_allow_gc,
     (setter)CLazyLinker_set_allow_gc,
     (char *)"do this function support allow_gc", NULL},
    {NULL, NULL, NULL, NULL} /* Sentinel */
};
static PyMemberDef CLazyLinker_members[] = {
    {(char *)"nodes", T_OBJECT_EX, offsetof(CLazyLinker, nodes), 0,
     (char *)"list of nodes"},
    {(char *)"thunks", T_OBJECT_EX, offsetof(CLazyLinker, thunks), 0,
     (char *)"list of thunks in program"},
    {(char *)"call_counts", T_OBJECT_EX, offsetof(CLazyLinker, call_counts), 0,
     (char *)"number of calls of each thunk"},
    {(char *)"call_times", T_OBJECT_EX, offsetof(CLazyLinker, call_times), 0,
     (char *)"total runtime in each thunk"},
    {(char *)"position_of_error", T_INT,
     offsetof(CLazyLinker, position_of_error), 0,
     (char *)"position of failed thunk"},
    {(char *)"time_thunks", T_INT, offsetof(CLazyLinker, do_timing), 0,
     (char *)"bool: nonzero means call will time thunks"},
    {(char *)"need_update_inputs", T_INT,
     offsetof(CLazyLinker, need_update_inputs), 0,
     (char *)"bool: nonzero means Function.__call__ must implement update "
             "mechanism"},
    {NULL} /* Sentinel */
};

static PyTypeObject lazylinker_ext_CLazyLinkerType = {
    PyVarObject_HEAD_INIT(NULL, 0)

        "lazylinker_ext.CLazyLinker",         /*tp_name*/
    sizeof(CLazyLinker),                      /*tp_basicsize*/
    0,                                        /*tp_itemsize*/
    CLazyLinker_dealloc,                      /*tp_dealloc*/
    0,                                        /*tp_print*/
    0,                                        /*tp_getattr*/
    0,                                        /*tp_setattr*/
    0,                                        /*tp_compare*/
    0,                                        /*tp_repr*/
    0,                                        /*tp_as_number*/
    0,                                        /*tp_as_sequence*/
    0,                                        /*tp_as_mapping*/
    0,                                        /*tp_hash */
    CLazyLinker_call,                         /*tp_call*/
    0,                                        /*tp_str*/
    0,                                        /*tp_getattro*/
    0,                                        /*tp_setattro*/
    0,                                        /*tp_as_buffer*/
    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /*tp_flags*/
    "CLazyLinker object",                     /* tp_doc */
    0,                                        /* tp_traverse */
    0,                                        /* tp_clear */
    0,                                        /* tp_richcompare */
    0,                                        /* tp_weaklistoffset */
    0,                                        /* tp_iter */
    0,                                        /* tp_iternext */
    0,                          // CLazyLinker_methods,       /* tp_methods */
    CLazyLinker_members,        /* tp_members */
    CLazyLinker_getset,         /* tp_getset */
    0,                          /* tp_base */
    0,                          /* tp_dict */
    0,                          /* tp_descr_get */
    0,                          /* tp_descr_set */
    0,                          /* tp_dictoffset */
    (initproc)CLazyLinker_init, /* tp_init */
    0,                          /* tp_alloc */
    CLazyLinker_new,            /* tp_new */
};

static PyObject *get_version(PyObject *dummy, PyObject *args) {
  PyObject *result = PyFloat_FromDouble(0.31);
  return result;
}

static PyMethodDef lazylinker_ext_methods[] = {
    {"get_version", get_version, METH_VARARGS, "Get extension version."},
    {NULL, NULL, 0, NULL} /* Sentinel */
};


static struct PyModuleDef moduledef = {PyModuleDef_HEAD_INIT,
                                       "lazylinker_ext",
                                       NULL,
                                       -1,
                                       lazylinker_ext_methods,
                                       NULL,
                                       NULL,
                                       NULL,
                                       NULL};

PyMODINIT_FUNC PyInit_lazylinker_ext(void) {

  PyObject *m;

  lazylinker_ext_CLazyLinkerType.tp_new = PyType_GenericNew;
  if (PyType_Ready(&lazylinker_ext_CLazyLinkerType) < 0)
    return NULL;

  m = PyModule_Create(&moduledef);
  Py_INCREF(&lazylinker_ext_CLazyLinkerType);
  PyModule_AddObject(m, "CLazyLinker",
                     (PyObject *)&lazylinker_ext_CLazyLinkerType);

  return m;
}
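
The "Free vars that are not needed anymore" loop in the listing above can be paraphrased in Python for readers who do not want to trace the C. This is only an illustrative sketch: the function name `inputs_to_free` and its arguments are hypothetical stand-ins for the struct fields of the same names, and it is not part of PyTensor's API.

```python
def inputs_to_free(node_inputs, var_has_owner, output_vars,
                   dependencies, var_computed):
    """Return the input variable indices whose storage may be cleared.

    Hypothetical helper mirroring the C loop: an input is freed only if it
    was produced by a node, is not a function output, and every variable
    that depends on it has already been computed.
    """
    outputs = set(output_vars)
    freeable = []
    for i_idx in node_inputs:
        # Graph inputs (variables without an owner node) are never cleared.
        if not var_has_owner[i_idx]:
            continue
        # Function outputs must stay alive until they are returned.
        if i_idx in outputs:
            continue
        # Keep the value while any dependent variable is still uncomputed (0).
        if any(not var_computed[d] for d in dependencies[i_idx]):
            continue
        freeable.append(i_idx)
    return freeable


# Variables 0 and 1 are graph inputs, 2 is an intermediate, 3 is the output.
print(inputs_to_free([2], [0, 0, 1, 1], [3], [[], [], [3], []], [1, 1, 1, 1]))
```

In the C code the freed variable's `var_computed` entry is then set to 2 rather than 0, which this sketch leaves out.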

@ricardoV94
Member

ricardoV94 commented May 30, 2025

Anyway, within the compiledir, there is a mod.cpp with the following contents. Is this of any use?

Yes, that's the PyTensor C module used to evaluate compiled graphs. It's compiled the first time you define a PyTensor function in the default backend.
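
To check whether that file exists on a given machine, something like the following could be used. It is a minimal sketch, assuming the `lazylinker_ext\mod.cpp` layout under the compiledir that appears in the compile command in this thread; the helper name `lazylinker_source` is made up for illustration.

```python
from pathlib import Path


def lazylinker_source(compiledir):
    """Return the path to lazylinker_ext/mod.cpp under compiledir, or None."""
    candidate = Path(compiledir) / "lazylinker_ext" / "mod.cpp"
    return candidate if candidate.is_file() else None


# With PyTensor installed, the directory is exposed as a config value:
#   import pytensor
#   print(lazylinker_source(pytensor.config.compiledir))
```

If the function returns `None`, the module has probably not been generated yet, for example because no PyTensor function has been defined since the compiledir was last cleared.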

@ricardoV94
Member

somehow, gcc15 decides that clock_gettime64 is required implicitly because of other functionality used.

Do you have an up-to-date traceback, or the file it tried to compile, that points in that direction?

@mvds314
Author

mvds314 commented May 30, 2025

Sure, here are some up-to-date stack traces!

One from the terminal:

You can find the C code in this temporary file: C:\Users\USERNAME\AppData\Local\Temp\pytensor_compilation_error_mgsnumrb
Traceback (most recent call last):
File "C:...\WPy64-31330\envs\MYENV\Lib\site-packages\pytensor\link\c\lazylinker_c.py", line 66, in <module>
raise ImportError(
...<3 lines>...
)
ImportError: Version check of the existing lazylinker compiled file. Looking for version 0.31, but found None. Extra debug information: force_compile=False, _need_reload=True

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:...\WPy64-31330\envs\MYENV\Lib\site-packages\pytensor\link\c\lazylinker_c.py", line 87, in <module>
raise ImportError(
...<3 lines>...
)
ImportError: Version check of the existing lazylinker compiled file. Looking for version 0.31, but found None. Extra debug information: force_compile=False, _need_reload=True

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "c:\Users\USERNAME\Repos\my_scripts\example_pytensor.py", line 23, in <module>
f = pytensor.function([a, b], c)
File "C:...\WPy64-31330\envs\MYENV\Lib\site-packages\pytensor\compile\function\__init__.py", line 332, in function
fn = pfunc(
params=inputs,
...<12 lines>...
trust_input=trust_input,
)
File "C:...\WPy64-31330\envs\MYENV\Lib\site-packages\pytensor\compile\function\pfunc.py", line 466, in pfunc
return orig_function(
inputs,
...<8 lines>...
trust_input=trust_input,
)
File "C:...\WPy64-31330\envs\MYENV\Lib\site-packages\pytensor\compile\function\types.py", line 1835, in orig_function
fn = m.create(defaults)
File "C:...\WPy64-31330\envs\MYENV\Lib\site-packages\pytensor\compile\function\types.py", line 1719, in create
_fn, _i, _o = self.linker.make_thunk(
~~~~~~~~~~~~~~~~~~~~~~^
input_storage=input_storage_lists, storage_map=storage_map
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
)
^
File "C:...\WPy64-31330\envs\MYENV\Lib\site-packages\pytensor\link\basic.py", line 245, in make_thunk
return self.make_all(
~~~~~~~~~~~~~^
input_storage=input_storage,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
output_storage=output_storage,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
storage_map=storage_map,
^^^^^^^^^^^^^^^^^^^^^^^^
)[:3]
^
File "C:...\WPy64-31330\envs\MYENV\Lib\site-packages\pytensor\link\vm.py", line 1285, in make_all
vm = self.make_vm(
order,
...<7 lines>...
self.updated_vars,
)
File "C:...\WPy64-31330\envs\MYENV\Lib\site-packages\pytensor\link\vm.py", line 1013, in make_vm
from pytensor.link.c.cvm import CVM
File "C:...\WPy64-31330\envs\MYENV\Lib\site-packages\pytensor\link\c\cvm.py", line 13, in <module>
from pytensor.link.c.lazylinker_c import CLazyLinker
File "C:...\WPy64-31330\envs\MYENV\Lib\site-packages\pytensor\link\c\lazylinker_c.py", line 122, in <module>
GCC_compiler.compile_str(dirname, code, location=loc, preargs=args)
~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:...\WPy64-31330\envs\MYENV\Lib\site-packages\pytensor\link\c\cmodule.py", line 2678, in compile_str
raise CompileError(
f"Compilation failed (return status={status}):\n{' '.join(cmd)}\n{compile_stderr}"
)
pytensor.link.c.exceptions.CompileError: Compilation failed (return status=1):
"C:...\msys64\mingw64\bin\g++.EXE" -shared -g -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -m64 -DMS_WIN64 -I"C:...\WPy64-31330\python\Lib\site-packages\numpy\_core\include" -I"C:...\WPy64-31330\python\include" -I"C:...\WPy64-31330\envs\MYENV\Lib\site-packages\pytensor\link\c\c_code" -L"C:...\WPy64-31330\python\libs" -L"C:...\WPy64-31330\python" -o "C:\Users\USERNAME\AppData\Local\PyTensor\compiledir_Windows-11-10.0.22631-SP0-Intel64_Family_6_Model_186_Stepping_2_GenuineIntel-3.13.3-64\lazylinker_ext\lazylinker_ext.pyd" "C:\Users\USERNAME\AppData\Local\PyTensor\compiledir_Windows-11-10.0.22631-SP0-Intel64_Family_6_Model_186_Stepping_2_GenuineIntel-3.13.3-64\lazylinker_ext\mod.cpp" "C:...\WPy64-31330\python\python313.dll"

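The chained exceptions above read as a fallback: the cached extension fails its version check ("Looking for version 0.31, but found None"), so a recompile is attempted, and that is where the `CompileError` surfaces. The sketch below shows that general pattern only. It is not the code in `lazylinker_c.py`; `check_version`, `load_or_compile`, and both callback parameters are hypothetical names, and the real module goes through more stages than shown here.

```python
REQUIRED_VERSION = 0.31  # the value get_version() returns in the posted mod.cpp


def check_version(module, required=REQUIRED_VERSION):
    """Raise ImportError when the extension is missing or has another version."""
    found = module.get_version() if module is not None else None
    if found != required:
        raise ImportError(
            f"Looking for version {required}, but found {found}."
        )
    return module


def load_or_compile(try_import, compile_module):
    """Use the cached extension if it passes the check, otherwise rebuild it.

    try_import: hypothetical callable returning the loaded module or None.
    compile_module: hypothetical callable that builds the extension; any
    compiler failure raised here propagates to the caller.
    """
    try:
        return check_version(try_import())
    except ImportError:
        compile_module()
        return check_version(try_import())
```

Under this reading, the two `ImportError` entries are expected on a fresh or stale compiledir, and only the final `CompileError` indicates the actual problem.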
And the contents of: pytensor_compilation_error_mgsnumrb

===============================
1	#include "pytensor_mod_helper.h"
2	#include "structmember.h"
3	#include <Python.h>
4	#include <sys/time.h>
5	
6	#if PY_VERSION_HEX >= 0x03000000
7	#include "numpy/npy_3kcompat.h"
8	#endif
9	
10	#ifndef Py_TYPE
11	#define Py_TYPE(obj) obj->ob_type
12	#endif
13	
14	/**
15	
16	TODO:
17	- Check max supported depth of recursion
18	- CLazyLinker should add context information to errors caught during evaluation.
19	Say what node we were on, add the traceback attached to the node.
20	- Clear containers of fully-useed intermediate results if allow_gc is 1
21	- Add timers for profiling
22	- Add support for profiling space used.
23	
24	
25	  */
26	static double pytime(const struct timeval *tv) {
27	  struct timeval t;
28	  if (!tv) {
29	    tv = &t;
30	    gettimeofday(&t, NULL);
31	  }
32	  return (double)tv->tv_sec + (double)tv->tv_usec / 1000000.0;
33	}
34	
35	/**
36	  Helper routine to convert a PyList of integers to a c array of integers.
37	  */
38	static int unpack_list_of_ssize_t(PyObject *pylist, Py_ssize_t **dst,
39	                                  Py_ssize_t *len, const char *kwname) {
40	  Py_ssize_t buflen, *buf;
41	  if (!PyList_Check(pylist)) {
42	    PyErr_Format(PyExc_TypeError, "%s must be list", kwname);
43	    return -1;
44	  }
45	  assert(NULL == *dst);
46	  *len = buflen = PyList_Size(pylist);
47	  *dst = buf = (Py_ssize_t *)calloc(buflen, sizeof(Py_ssize_t));
48	  assert(buf);
49	  for (int ii = 0; ii < buflen; ++ii) {
50	    PyObject *el_i = PyList_GetItem(pylist, ii);
51	    Py_ssize_t n_i = PyNumber_AsSsize_t(el_i, PyExc_IndexError);
52	    if (PyErr_Occurred()) {
53	      free(buf);
54	      *dst = NULL;
55	      return -1;
56	    }
57	    buf[ii] = n_i;
58	  }
59	  return 0;
60	}
61	
62	static int unpack_nested_tuples_of_ssize_t(PyObject *pytuple, Py_ssize_t **dst,
63	                                           Py_ssize_t *len,
64	                                           const char *kwname) {
65	  Py_ssize_t buflen, *buf;
66	  if (!PyTuple_Check(pytuple)) {
67	    PyErr_Format(PyExc_TypeError, "%s must be a tuple of tuples", kwname);
68	    return -1;
69	  }
70	  *len = buflen = PyTuple_Size(pytuple);
71	  *dst = buf = (Py_ssize_t *) malloc(sizeof(Py_ssize_t *) * buflen * 2);
72	  assert(buf);
73	  for (int ii = 0; ii < buflen; ++ii) {
74	    PyObject *el_i = PyTuple_GetItem(pytuple, ii);
75	
76	    PyObject *el_i_1 = PyTuple_GetItem(el_i, 0);
77	    PyObject *el_i_2 = PyTuple_GetItem(el_i, 1);
78	
79	    Py_ssize_t n_1 = PyNumber_AsSsize_t(el_i_1, PyExc_IndexError);
80	    Py_ssize_t n_2 = PyNumber_AsSsize_t(el_i_2, PyExc_IndexError);
81	
82	    if (PyErr_Occurred()) {
83	      free(buf);
84	      dst = NULL;
85	      return -1;
86	    }
87	
88	    buf[ii * 2 + 0] = n_1;
89	    buf[ii * 2 + 1] = n_2;
90	  }
91	  return 0;
92	}
93	
94	/**
95	
96	  CLazyLinker
97	
98	
99	  */
100	typedef struct {
101	  PyObject_HEAD
102	      /* Type-specific fields go here. */
103	      PyObject *nodes;      // the python list of nodes
104	  PyObject *thunks;         // python list of thunks
105	  PyObject *pre_call_clear; // list of cells to clear on call.
106	  int allow_gc;
107	  Py_ssize_t n_applies;
108	  int n_vars;        // number of variables in the graph
109	  int *var_computed; // 1 or 0 for every variable
110	  PyObject **var_computed_cells;
111	  PyObject **var_value_cells;
112	  Py_ssize_t **dependencies; // list of vars dependencies for GC
113	  Py_ssize_t *n_dependencies;
114	
115	  Py_ssize_t n_output_vars;
116	  Py_ssize_t *output_vars; // variables that *must* be evaluated by call
117	
118	  int *is_lazy; // 1 or 0 for every thunk
119	
120	  Py_ssize_t *var_owner; // nodes[[var_owner[var_idx]]] is var[var_idx]->owner
121	  int *var_has_owner;    //  1 or 0
122	
123	  Py_ssize_t *node_n_inputs;
124	  Py_ssize_t *node_n_outputs;
125	  Py_ssize_t **node_inputs;
126	  Py_ssize_t **node_outputs;
127	  Py_ssize_t
128	      *node_inputs_outputs_base; // node_inputs and node_outputs point into this
129	  Py_ssize_t *node_n_prereqs;
130	  Py_ssize_t **node_prereqs;
131	
132	  Py_ssize_t *update_storage; // (input_idx, output_idx) pairs specifying
133	                              // output-to-input updates
134	  Py_ssize_t n_updates;
135	
136	  void **thunk_cptr_fn;
137	  void **thunk_cptr_data;
138	  PyObject *call_times;
139	  PyObject *call_counts;
140	  int do_timing;
141	  int need_update_inputs;
142	  int position_of_error; // -1 for no error, otw the index into `thunks` that
143	                         // failed.
144	} CLazyLinker;
145	
146	static void CLazyLinker_dealloc(PyObject *_self) {
147	  CLazyLinker *self = (CLazyLinker *)_self;
148	  free(self->thunk_cptr_fn);
149	  free(self->thunk_cptr_data);
150	
151	  free(self->is_lazy);
152	
153	  free(self->update_storage);
154	
155	  if (self->node_n_prereqs) {
156	    for (int i = 0; i < self->n_applies; ++i) {
157	      free(self->node_prereqs[i]);
158	    }
159	  }
160	  free(self->node_n_prereqs);
161	  free(self->node_prereqs);
162	  free(self->node_inputs_outputs_base);
163	  free(self->node_n_inputs);
164	  free(self->node_n_outputs);
165	  free(self->node_inputs);
166	  free(self->node_outputs);
167	
168	  if (self->dependencies) {
169	    for (int i = 0; i < self->n_vars; ++i) {
170	      free(self->dependencies[i]);
171	    }
172	    free(self->dependencies);
173	    free(self->n_dependencies);
174	  }
175	
176	  free(self->var_owner);
177	  free(self->var_has_owner);
178	  free(self->var_computed);
179	  if (self->var_computed_cells) {
180	    for (int i = 0; i < self->n_vars; ++i) {
181	      Py_DECREF(self->var_computed_cells[i]);
182	      Py_DECREF(self->var_value_cells[i]);
183	    }
184	  }
185	  free(self->var_computed_cells);
186	  free(self->var_value_cells);
187	  free(self->output_vars);
188	
189	  Py_XDECREF(self->nodes);
190	  Py_XDECREF(self->thunks);
191	  Py_XDECREF(self->call_times);
192	  Py_XDECREF(self->call_counts);
193	  Py_XDECREF(self->pre_call_clear);
194	  Py_TYPE(self)->tp_free((PyObject *)self);
195	}
196	static PyObject *CLazyLinker_new(PyTypeObject *type, PyObject *args,
197	                                 PyObject *kwds) {
198	  CLazyLinker *self;
199	
200	  self = (CLazyLinker *)type->tp_alloc(type, 0);
201	  if (self != NULL) {
202	    self->nodes = NULL;
203	    self->thunks = NULL;
204	    self->pre_call_clear = NULL;
205	
206	    self->allow_gc = 1;
207	    self->n_applies = 0;
208	    self->n_vars = 0;
209	    self->var_computed = NULL;
210	    self->var_computed_cells = NULL;
211	    self->var_value_cells = NULL;
212	    self->dependencies = NULL;
213	    self->n_dependencies = NULL;
214	
215	    self->n_output_vars = 0;
216	    self->output_vars = NULL;
217	
218	    self->is_lazy = NULL;
219	
220	    self->var_owner = NULL;
221	    self->var_has_owner = NULL;
222	
223	    self->node_n_inputs = NULL;
224	    self->node_n_outputs = NULL;
225	    self->node_inputs = NULL;
226	    self->node_outputs = NULL;
227	    self->node_inputs_outputs_base = NULL;
228	    self->node_prereqs = NULL;
229	    self->node_n_prereqs = NULL;
230	
231	    self->update_storage = NULL;
232	    self->n_updates = 0;
233	
234	    self->thunk_cptr_data = NULL;
235	    self->thunk_cptr_fn = NULL;
236	    self->call_times = NULL;
237	    self->call_counts = NULL;
238	    self->do_timing = 0;
239	
240	    self->need_update_inputs = 0;
241	    self->position_of_error = -1;
242	  }
243	  return (PyObject *)self;
244	}
245	
246	static int CLazyLinker_init(CLazyLinker *self, PyObject *args, PyObject *kwds) {
247	  static char *kwlist[] = {(char *)"nodes",
248	                           (char *)"thunks",
249	                           (char *)"pre_call_clear",
250	                           (char *)"allow_gc",
251	                           (char *)"call_counts",
252	                           (char *)"call_times",
253	                           (char *)"compute_map_list",
254	                           (char *)"storage_map_list",
255	                           (char *)"base_input_output_list",
256	                           (char *)"node_n_inputs",
257	                           (char *)"node_n_outputs",
258	                           (char *)"node_input_offset",
259	                           (char *)"node_output_offset",
260	                           (char *)"var_owner",
261	                           (char *)"is_lazy_list",
262	                           (char *)"output_vars",
263	                           (char *)"node_prereqs",
264	                           (char *)"node_output_size",
265	                           (char *)"update_storage",
266	                           (char *)"dependencies",
267	                           NULL};
268	
269	  PyObject *compute_map_list = NULL, *storage_map_list = NULL,
270	           *base_input_output_list = NULL, *node_n_inputs = NULL,
271	           *node_n_outputs = NULL, *node_input_offset = NULL,
272	           *node_output_offset = NULL, *var_owner = NULL, *is_lazy = NULL,
273	           *output_vars = NULL, *node_prereqs = NULL, *node_output_size = NULL,
274	    *dependencies = NULL, *update_storage=NULL;
275	
276	  assert(!self->nodes);
277	  if (!PyArg_ParseTupleAndKeywords(
278	          args, kwds, "OOOiOOOOOOOOOOOOOOOO", kwlist, &self->nodes,
279	          &self->thunks, &self->pre_call_clear, &self->allow_gc,
280	          &self->call_counts, &self->call_times, &compute_map_list,
281	          &storage_map_list, &base_input_output_list, &node_n_inputs,
282	          &node_n_outputs, &node_input_offset, &node_output_offset, &var_owner,
283	          &is_lazy, &output_vars, &node_prereqs, &node_output_size,
284	          &update_storage, &dependencies))
285	    return -1;
286	  Py_INCREF(self->nodes);
287	  Py_INCREF(self->thunks);
288	  Py_INCREF(self->pre_call_clear);
289	  Py_INCREF(self->call_counts);
290	  Py_INCREF(self->call_times);
291	
292	  Py_ssize_t n_applies = PyList_Size(self->nodes);
293	
294	  self->n_applies = n_applies;
295	  self->n_vars = PyList_Size(var_owner);
296	
297	  if (PyList_Size(self->thunks) != n_applies)
298	    return -1;
299	  if (PyList_Size(self->call_counts) != n_applies)
300	    return -1;
301	  if (PyList_Size(self->call_times) != n_applies)
302	    return -1;
303	
304	  // allocated and initialize thunk_cptr_data and thunk_cptr_fn
305	  if (n_applies) {
306	    self->thunk_cptr_data = (void **)calloc(n_applies, sizeof(void *));
307	    self->thunk_cptr_fn = (void **)calloc(n_applies, sizeof(void *));
308	    self->is_lazy = (int *)calloc(n_applies, sizeof(int));
309	    self->node_prereqs = (Py_ssize_t **)calloc(n_applies, sizeof(Py_ssize_t *));
310	    self->node_n_prereqs = (Py_ssize_t *)calloc(n_applies, sizeof(Py_ssize_t));
311	    assert(self->node_prereqs);
312	    assert(self->node_n_prereqs);
313	    assert(self->is_lazy);
314	    assert(self->thunk_cptr_fn);
315	    assert(self->thunk_cptr_data);
316	
317	    for (int i = 0; i < n_applies; ++i) {
318	      PyObject *thunk = PyList_GetItem(self->thunks, i);
319	      // thunk is borrowed
320	      if (PyObject_HasAttrString(thunk, "cthunk")) {
321	        PyObject *cthunk = PyObject_GetAttrString(thunk, "cthunk");
322	        // new reference
323	        assert(cthunk && NpyCapsule_Check(cthunk));
324	        self->thunk_cptr_fn[i] = NpyCapsule_AsVoidPtr(cthunk);
325	        self->thunk_cptr_data[i] = NpyCapsule_GetDesc(cthunk);
326	        Py_DECREF(cthunk);
327	        // cthunk is kept alive by membership in self->thunks
328	      }
329	
330	      PyObject *el_i = PyList_GetItem(is_lazy, i);
331	      self->is_lazy[i] = PyNumber_AsSsize_t(el_i, NULL);
332	
333	      /* now get the prereqs */
334	      el_i = PyList_GetItem(node_prereqs, i);
335	      assert(PyList_Check(el_i));
336	      self->node_n_prereqs[i] = PyList_Size(el_i);
337	      if (self->node_n_prereqs[i]) {
338	        self->node_prereqs[i] =
339	            (Py_ssize_t *)malloc(PyList_Size(el_i) * sizeof(Py_ssize_t));
340	        for (int j = 0; j < PyList_Size(el_i); ++j) {
341	          PyObject *el_ij = PyList_GetItem(el_i, j);
342	          Py_ssize_t N = PyNumber_AsSsize_t(el_ij, PyExc_IndexError);
343	          if (PyErr_Occurred())
344	            return -1;
345	          // N < n. variables
346	          assert(N < PyList_Size(var_owner));
347	          self->node_prereqs[i][j] = N;
348	        }
349	      }
350	    }
351	  }
352	  if (PyList_Check(base_input_output_list)) {
353	    Py_ssize_t n_inputs_outputs_base = PyList_Size(base_input_output_list);
354	    self->node_inputs_outputs_base =
355	        (Py_ssize_t *)calloc(n_inputs_outputs_base, sizeof(Py_ssize_t));
356	    assert(self->node_inputs_outputs_base);
357	    for (int i = 0; i < n_inputs_outputs_base; ++i) {
358	      PyObject *el_i = PyList_GetItem(base_input_output_list, i);
359	      Py_ssize_t idx = PyNumber_AsSsize_t(el_i, PyExc_IndexError);
360	      if (PyErr_Occurred())
361	        return -1;
362	      self->node_inputs_outputs_base[i] = idx;
363	    }
364	    self->node_n_inputs = (Py_ssize_t *)calloc(n_applies, sizeof(Py_ssize_t));
365	    assert(self->node_n_inputs);
366	    self->node_n_outputs = (Py_ssize_t *)calloc(n_applies, sizeof(Py_ssize_t));
367	    assert(self->node_n_outputs);
368	    self->node_inputs = (Py_ssize_t **)calloc(n_applies, sizeof(Py_ssize_t *));
369	    assert(self->node_inputs);
370	    self->node_outputs = (Py_ssize_t **)calloc(n_applies, sizeof(Py_ssize_t *));
371	    assert(self->node_outputs);
372	    for (int i = 0; i < n_applies; ++i) {
373	      Py_ssize_t N;
374	      N = PyNumber_AsSsize_t(PyList_GetItem(node_n_inputs, i),
375	                             PyExc_IndexError);
376	      if (PyErr_Occurred())
377	        return -1;
378	      assert(N <= n_inputs_outputs_base);
379	      self->node_n_inputs[i] = N;
380	      N = PyNumber_AsSsize_t(PyList_GetItem(node_n_outputs, i),
381	                             PyExc_IndexError);
382	      if (PyErr_Occurred())
383	        return -1;
384	      assert(N <= n_inputs_outputs_base);
385	      self->node_n_outputs[i] = N;
386	      N = PyNumber_AsSsize_t(PyList_GetItem(node_input_offset, i),
387	                             PyExc_IndexError);
388	      if (PyErr_Occurred())
389	        return -1;
390	      assert(N <= n_inputs_outputs_base);
391	      self->node_inputs[i] = &self->node_inputs_outputs_base[N];
392	      N = PyNumber_AsSsize_t(PyList_GetItem(node_output_offset, i),
393	                             PyExc_IndexError);
394	      if (PyErr_Occurred())
395	        return -1;
396	      assert(N <= n_inputs_outputs_base);
397	      self->node_outputs[i] = &self->node_inputs_outputs_base[N];
398	    }
399	  } else {
400	    PyErr_SetString(PyExc_TypeError, "base_input_output_list must be list");
401	    return -1;
402	  }
403	
404	  // allocation for var_owner
405	  if (PyList_Check(var_owner)) {
406	    self->var_owner = (Py_ssize_t *)calloc(self->n_vars, sizeof(Py_ssize_t));
407	    self->var_has_owner = (int *)calloc(self->n_vars, sizeof(int));
408	    self->var_computed = (int *)calloc(self->n_vars, sizeof(int));
409	    self->var_computed_cells =
410	        (PyObject **)calloc(self->n_vars, sizeof(PyObject *));
411	    self->var_value_cells =
412	        (PyObject **)calloc(self->n_vars, sizeof(PyObject *));
413	    for (int i = 0; i < self->n_vars; ++i) {
414	      PyObject *el_i = PyList_GetItem(var_owner, i);
415	      if (el_i == Py_None) {
416	        self->var_has_owner[i] = 0;
417	      } else {
418	        Py_ssize_t N = PyNumber_AsSsize_t(el_i, PyExc_IndexError);
419	        if (PyErr_Occurred())
420	          return -1;
421	        assert(N <= n_applies);
422	        self->var_owner[i] = N;
423	        self->var_has_owner[i] = 1;
424	      }
425	      self->var_computed_cells[i] = PyList_GetItem(compute_map_list, i);
426	      Py_INCREF(self->var_computed_cells[i]);
427	      self->var_value_cells[i] = PyList_GetItem(storage_map_list, i);
428	      Py_INCREF(self->var_value_cells[i]);
429	    }
430	  } else {
431	    PyErr_SetString(PyExc_TypeError, "var_owner must be list");
432	    return -1;
433	  }
434	
435	  if (dependencies != Py_None) {
436	    self->dependencies =
437	        (Py_ssize_t **)calloc(self->n_vars, sizeof(Py_ssize_t *));
438	    self->n_dependencies =
439	        (Py_ssize_t *)calloc(self->n_vars, sizeof(Py_ssize_t));
440	    assert(self->dependencies);
441	    assert(self->n_dependencies);
442	
443	    for (int i = 0; i < self->n_vars; ++i) {
444	      PyObject *tmp = PyList_GetItem(dependencies, i);
445	      // refcounting - tmp is borrowed
446	      if (unpack_list_of_ssize_t(tmp, &self->dependencies[i],
447	                                 &self->n_dependencies[i], "dependencies"))
448	        return -1;
449	    }
450	  }
451	
452	  if (unpack_list_of_ssize_t(output_vars, &self->output_vars,
453	                             &self->n_output_vars, "output_vars"))
454	    return -1;
455	
456	  for (int i = 0; i < self->n_output_vars; ++i) {
457	    assert(self->output_vars[i] < self->n_vars);
458	  }
459	
460	  if (unpack_nested_tuples_of_ssize_t(update_storage, &self->update_storage,
461	                                      &self->n_updates, "updates_storage"))
462	    return -1;
463	
464	  return 0;
465	}
466	static void set_position_of_error(CLazyLinker *self, int owner_idx) {
467	  if (self->position_of_error == -1) {
468	    self->position_of_error = owner_idx;
469	  }
470	}
471	static PyObject *pycall(CLazyLinker *self, Py_ssize_t node_idx, int verbose) {
472	  // call thunk to see which inputs it wants
473	  PyObject *thunk = PyList_GetItem(self->thunks, node_idx);
474	  // refcounting - thunk is borrowed
475	  PyObject *rval = NULL;
476	  if (self->do_timing) {
477	    double t0 = pytime(NULL);
478	    if (verbose)
479	      fprintf(stderr, "calling via Python (node %i)\n", (int)node_idx);
480	    rval = PyObject_CallNoArgs(thunk);
481	    if (rval) {
482	      double t1 = pytime(NULL);
483	      double ti = PyFloat_AsDouble(PyList_GetItem(self->call_times, node_idx));
484	      PyList_SetItem(self->call_times, node_idx,
485	                     PyFloat_FromDouble(t1 - t0 + ti));
486	      PyObject *count = PyList_GetItem(self->call_counts, node_idx);
487	      long icount = PyLong_AsLong(count);
488	      PyList_SetItem(self->call_counts, node_idx, PyLong_FromLong(icount + 1));
489	    }
490	  } else {
491	    if (verbose) {
492	      fprintf(stderr, "calling via Python (node %i)\n", (int)node_idx);
493	    }
494	    rval = PyObject_CallNoArgs(thunk);
495	  }
496	  return rval;
497	}
498	static int c_call(CLazyLinker *self, Py_ssize_t node_idx, int verbose) {
499	  void *ptr_addr = self->thunk_cptr_fn[node_idx];
500	  int (*fn)(void *) = (int (*)(void *))(ptr_addr);
501	  if (verbose)
502	    fprintf(stderr, "calling non-lazy shortcut (node %i)\n", (int)node_idx);
503	  int err = 0;
504	  if (self->do_timing) {
505	    double t0 = pytime(NULL);
506	    err = fn(self->thunk_cptr_data[node_idx]);
507	    double t1 = pytime(NULL);
508	    double ti = PyFloat_AsDouble(PyList_GetItem(self->call_times, node_idx));
509	    PyList_SetItem(self->call_times, node_idx,
510	                   PyFloat_FromDouble(t1 - t0 + ti));
511	    PyObject *count = PyList_GetItem(self->call_counts, node_idx);
512	    long icount = PyLong_AsLong(count);
513	    PyList_SetItem(self->call_counts, node_idx, PyLong_FromLong(icount + 1));
514	  } else {
515	    err = fn(self->thunk_cptr_data[node_idx]);
516	  }
517	
518	  if (err) {
519	    // cast the argument to a PyList (as described near line 226 of cc.py)
520	    PyObject *__ERROR = ((PyObject **)self->thunk_cptr_data[node_idx])[0];
521	    assert(PyList_Check(__ERROR));
522	    assert(PyList_Size(__ERROR) == 3);
523	    PyObject *err_type = PyList_GetItem(__ERROR, 0);  // stolen ref
524	    PyObject *err_msg = PyList_GetItem(__ERROR, 1);   // stolen ref
525	    PyObject *err_trace = PyList_GetItem(__ERROR, 2); // stolen ref
526	    PyList_SET_ITEM(__ERROR, 0, Py_None);
527	    Py_INCREF(Py_None); // clobbers old ref
528	    PyList_SET_ITEM(__ERROR, 1, Py_None);
529	    Py_INCREF(Py_None); // clobbers old ref
530	    PyList_SET_ITEM(__ERROR, 2, Py_None);
531	    Py_INCREF(Py_None); // clobbers old ref
532	
533	    assert(!PyErr_Occurred()); // because CLinker hid the exception in __ERROR
534	                               // aka data
535	    PyErr_Restore(err_type, err_msg, err_trace); // steals refs to args
536	  }
537	  if (err)
538	    set_position_of_error(self, node_idx);
539	  return err;
540	}
541	static int lazy_rec_eval(CLazyLinker *self, Py_ssize_t var_idx, PyObject *one,
542	                         PyObject *zero) {
543	  PyObject *rval = NULL;
544	  int verbose = 0;
545	  int err = 0;
546	
547	  if (verbose)
548	    fprintf(stderr, "lazy_rec computing %i\n", (int)var_idx);
549	
550	  if (self->var_computed[var_idx] || !self->var_has_owner[var_idx])
551	    return 0;
552	
553	  Py_ssize_t owner_idx = self->var_owner[var_idx];
554	
555	  // STEP 1: compute the pre-requirements of the node
556	  // Includes input nodes for non-lazy ops.
557	  for (int i = 0; i < self->node_n_prereqs[owner_idx]; ++i) {
558	    Py_ssize_t prereq_idx = self->node_prereqs[owner_idx][i];
559	    if (!self->var_computed[prereq_idx]) {
560	      err = lazy_rec_eval(self, prereq_idx, one, zero);
561	      if (err)
562	        return err;
563	    }
564	    assert(self->var_computed[prereq_idx]);
565	  }
566	
567	  // STEP 2: compute the node itself
568	  if (self->is_lazy[owner_idx]) {
569	    // update the compute_map cells corresponding to the inputs of this thunk
570	    for (int i = 0; i < self->node_n_inputs[owner_idx]; ++i) {
571	      int in_idx = self->node_inputs[owner_idx][i];
572	      if (self->var_computed[in_idx]) {
573	        Py_INCREF(one);
574	        err = PyList_SetItem(self->var_computed_cells[in_idx], 0, one);
575	      } else {
576	        Py_INCREF(zero);
577	        err = PyList_SetItem(self->var_computed_cells[in_idx], 0, zero);
578	      }
579	      if (err)
580	        goto fail;
581	    }
582	
583	    rval = pycall(self, owner_idx, verbose);
584	    // refcounting - rval is new ref
585	    // TODO: to prevent infinite loops
586	    // - consider check that a thunk does not ask for an input that is already
587	    // computed
588	    if (rval == NULL) {
589	      assert(PyErr_Occurred());
590	      err = 1;
591	      goto fail;
592	    }
593	
594	    // update the computed-ness of any output cells
595	    for (int i = 0; i < self->node_n_outputs[owner_idx]; ++i) {
596	      int out_idx = self->node_outputs[owner_idx][i];
597	      PyObject *el_i = PyList_GetItem(self->var_computed_cells[out_idx], 0);
598	      Py_ssize_t N = PyNumber_AsSsize_t(el_i, PyExc_IndexError);
599	      if (PyErr_Occurred()) {
600	        err = -1;
601	        goto pyfail;
602	      }
603	      assert(N == 0 || N == 1);
604	      self->var_computed[out_idx] = N;
605	    }
606	    if (!self->var_computed[var_idx]) {
607	      /*
608	       * If self is not computed after the call, this means that some
609	       * inputs are needed.  Compute the ones on the returned list
610	       * and try to compute the current node again (with recursive call).
611	       * This allows a node to request more nodes more than once before
612	       * finally yielding a result.
613	       */
614	      if (!PyList_Check(rval)) {
615	        // TODO: More helpful error to help find *which node* made this
616	        // bad thunk
617	        PyErr_SetString(PyExc_TypeError, "lazy thunk should return a list");
618	        err = 1;
619	        goto pyfail;
620	      }
621	
622	      if (!PyList_Size(rval)) {
623	        PyErr_SetString(
624	            PyExc_ValueError,
625	            "lazy thunk returned empty list without computing output");
626	        err = 1;
627	        goto pyfail;
628	      }
629	
630	      for (int i = 0; i < PyList_Size(rval); ++i) {
631	        PyObject *el_i = PyList_GetItem(rval, i);
632	        Py_ssize_t N = PyNumber_AsSsize_t(el_i, PyExc_IndexError);
633	        if (PyErr_Occurred()) {
634	          err = 1;
635	          goto pyfail;
636	        }
637	        assert(N <= self->node_n_inputs[owner_idx]);
638	        Py_ssize_t input_idx = self->node_inputs[owner_idx][N];
639	        err = lazy_rec_eval(self, input_idx, one, zero);
640	        if (err)
641	          goto pyfail;
642	      }
643	
644	      Py_DECREF(rval);
645	      /*
646	       * We intentionally skip all the end-of-function processing
647	       * (mark outputs, GC) as it will be performed by the call
648	       * that actually manages to compute the result.
649	       */
650	      return lazy_rec_eval(self, var_idx, one, zero);
651	    }
652	
653	    Py_DECREF(rval);
654	  } else // owner is not a lazy op. Ensure all inputs are evaluated.
655	  {
656	    // loop over inputs to owner
657	    // call lazy_rec_eval on each one that is not computed.
658	    // if there's an error, pass it up the stack
659	    for (int i = 0; i < self->node_n_inputs[owner_idx]; ++i) {
660	      Py_ssize_t input_idx = self->node_inputs[owner_idx][i];
661	      if (!self->var_computed[input_idx]) {
662	        err = lazy_rec_eval(self, input_idx, one, zero);
663	        if (err)
664	          return err;
665	      }
666	      assert(self->var_computed[input_idx]);
667	    }
668	
669	    // call the thunk for this owner.
670	    if (self->thunk_cptr_fn[owner_idx]) {
671	      err = c_call(self, owner_idx, verbose);
672	      if (err)
673	        goto fail;
674	    } else {
675	      rval = pycall(self, owner_idx, verbose);
676	      // rval is new ref
677	      if (rval) // pycall returned normally (no exception)
678	      {
679	        if (rval == Py_None) {
680	          Py_DECREF(rval); // ignore a return of None
681	        } else if (PyList_Check(rval)) {
682	          PyErr_SetString(PyExc_TypeError,
683	                          "non-lazy thunk should return None, not list");
684	          err = 1;
685	          goto pyfail;
686	        } else // don't know what it returned, but it wasn't right.
687	        {
688	          PyErr_SetObject(PyExc_TypeError, rval);
689	          err = 1;
690	          // We don't release rval since we put it in the error above
691	          goto fail;
692	        }
693	      } else // pycall returned NULL (internal error)
694	      {
695	        err = 1;
696	        goto fail;
697	      }
698	    }
699	  }
700	
701	  // loop over all outputs and mark them as computed
702	  for (int i = 0; i < self->node_n_outputs[owner_idx]; ++i) {
703	    self->var_computed[self->node_outputs[owner_idx][i]] = 1;
704	  }
705	
706	  // Free vars that are not needed anymore
707	  if (self->allow_gc) {
708	    for (int i = 0; i < self->node_n_inputs[owner_idx]; ++i) {
709	      int cleanup = 1;
710	      Py_ssize_t i_idx = self->node_inputs[owner_idx][i];
711	      if (!self->var_has_owner[i_idx])
712	        continue;
713	
714	      for (int j = 0; j < self->n_output_vars; ++j) {
715	        if (i_idx == self->output_vars[j]) {
716	          cleanup = 0;
717	          break;
718	        }
719	      }
720	      if (!cleanup)
721	        continue;
722	
723	      for (int j = 0; j < self->n_dependencies[i_idx]; ++j) {
724	        if (!self->var_computed[self->dependencies[i_idx][j]]) {
725	          cleanup = 0;
726	          break;
727	        }
728	      }
729	      if (!cleanup)
730	        continue;
731	
732	      Py_INCREF(Py_None);
733	      err = PyList_SetItem(self->var_value_cells[i_idx], 0, Py_None);
734	      // See the Stack gc implementation for why we change it to 2 and not 0.
735	      self->var_computed[i_idx] = 2;
736	      if (err)
737	        goto fail;
738	    }
739	  }
740	
741	  return 0;
742	pyfail:
743	  Py_DECREF(rval);
744	fail:
745	  set_position_of_error(self, owner_idx);
746	  return err;
747	}
748	
749	static PyObject *CLazyLinker_call(PyObject *_self, PyObject *args,
750	                                  PyObject *kwds) {
751	  CLazyLinker *self = (CLazyLinker *)_self;
752	  static char *kwlist[] = {(char *)"time_thunks", (char *)"n_calls",
753	                           (char *)"output_subset", NULL};
754	  int n_calls = 1;
755	  PyObject *output_subset_ptr = NULL;
756	  if (!PyArg_ParseTupleAndKeywords(args, kwds, "|iiO", kwlist, &self->do_timing,
757	                                   &n_calls, &output_subset_ptr))
758	    return NULL;
759	
760	  int err = 0;
761	  // parse an output_subset list
762	  // it is stored as a bool list of length n_output_vars: calculate a var or not
763	  char *output_subset = NULL;
764	  int output_subset_size = -1;
765	  if (output_subset_ptr != NULL) {
766	    if (!PyList_Check(output_subset_ptr)) {
767	      err = 1;
768	      PyErr_SetString(PyExc_RuntimeError, "Output_subset is not a list");
769	    } else {
770	      output_subset_size = PyList_Size(output_subset_ptr);
771	      output_subset = (char *)calloc(self->n_output_vars, sizeof(char));
772	      for (int it = 0; it < output_subset_size; ++it) {
773	        PyObject *elem = PyList_GetItem(output_subset_ptr, it);
774	        if (!PyLong_Check(elem)) {
775	          err = 1;
776	          PyErr_SetString(PyExc_RuntimeError,
777	                          "Some elements of output_subset list are not int");
778	        }
779	        output_subset[PyLong_AsLong(elem)] = 1;
780	      }
781	    }
782	  }
783	
784	  self->position_of_error = -1;
785	  // create constants used to fill the var_compute_cells
786	  PyObject *one = PyLong_FromLong(1);
787	  PyObject *zero = PyLong_FromLong(0);
788	
789	  // pre-allocate our return value
790	  Py_INCREF(Py_None);
791	  PyObject *rval = Py_None;
792	  // clear storage of pre_call_clear elements
793	  for (int call_i = 0; call_i < n_calls && (!err); ++call_i) {
794	    Py_ssize_t n_pre_call_clear = PyList_Size(self->pre_call_clear);
795	    assert(PyList_Check(self->pre_call_clear));
796	    for (int i = 0; i < n_pre_call_clear; ++i) {
797	      PyObject *el_i = PyList_GetItem(self->pre_call_clear, i);
798	      Py_INCREF(Py_None);
799	      PyList_SetItem(el_i, 0, Py_None);
800	    }
801	    // clear the computed flag out of all non-input vars
802	    for (int i = 0; i < self->n_vars; ++i) {
803	      self->var_computed[i] = !self->var_has_owner[i];
804	      if (self->var_computed[i]) {
805	        Py_INCREF(one);
806	        PyList_SetItem(self->var_computed_cells[i], 0, one);
807	      } else {
808	        Py_INCREF(zero);
809	        PyList_SetItem(self->var_computed_cells[i], 0, zero);
810	      }
811	    }
812	
813	    int first_updated = self->n_output_vars - self->n_updates;
814	    for (int i = 0; i < self->n_output_vars && (!err); ++i) {
815	      if (i >= first_updated || output_subset == NULL ||
816	          output_subset[i] == 1) {
817	        err = lazy_rec_eval(self, self->output_vars[i], one, zero);
818	      }
819	    }
820	
821	    if (!err) {
822	      // save references to outputs prior to updating storage containers
823	      assert(self->n_output_vars >= self->n_updates);
824	      Py_DECREF(rval);
825	      rval = PyList_New(self->n_output_vars);
826	      for (int i = 0; i < (self->n_output_vars); ++i) {
827	        Py_ssize_t src = self->output_vars[i];
828	        PyObject *item = PyList_GetItem(self->var_value_cells[src], 0);
829	        if ((output_subset == NULL || output_subset[i]) &&
830	            self->var_computed[src] != 1) {
831	          err = 1;
832	          PyErr_Format(PyExc_AssertionError,
833	                       "The compute map of output %d should contain "
834	                       "1 at the end of execution, not %d.",
835	                       i, self->var_computed[src]);
836	          break;
837	        }
838	        Py_INCREF(item);
839	        PyList_SetItem(rval, i, item);
840	      }
841	    }
842	
843	    if (!err) {
844	      // Update the inputs that have an update rule
845	      for (int i = 0; i < self->n_updates; ++i) {
846	        Py_ssize_t in_idx = self->update_storage[i * 2 + 0];
847	        Py_ssize_t out_idx = self->update_storage[i * 2 + 1];
848	        PyObject *tmp = PyList_GetItem(rval, out_idx);
849	        Py_INCREF(tmp);
850	        PyList_SetItem(self->var_value_cells[in_idx], 0, tmp);
851	      }
852	    }
853	  }
854	
855	  /*
856	    Clear everything that is left and not an output. This is needed
857	    for lazy evaluation since the current GC algo is too conservative
858	    with lazy graphs.
859	  */
860	  if (self->allow_gc && !err) {
861	    for (Py_ssize_t i = 0; i < self->n_vars; ++i) {
862	      int do_cleanup = 1;
863	      if (!self->var_has_owner[i] || !self->var_computed[i])
864	        continue;
865	      for (int j = 0; j < self->n_output_vars; ++j) {
866	        if (i == self->output_vars[j]) {
867	          do_cleanup = 0;
868	          break;
869	        }
870	      }
871	      if (!do_cleanup)
872	        continue;
873	      Py_INCREF(Py_None);
874	      PyList_SetItem(self->var_value_cells[i], 0, Py_None);
875	    }
876	  }
877	  if (output_subset != NULL)
878	    free(output_subset);
879	
880	  Py_DECREF(one);
881	  Py_DECREF(zero);
882	  if (err) {
883	    Py_DECREF(rval);
884	    return NULL;
885	  }
886	  return rval;
887	}
888	
889	#if 0
890	static PyMethodDef CLazyLinker_methods[] = {
891	    {
892	      //"name", (PyCFunction)CLazyLinker_accept, METH_VARARGS, "Return the name, combining the first and last name"
893	    },
894	    {NULL}  /* Sentinel */
895	};
896	#endif
897	
898	static PyObject *CLazyLinker_get_allow_gc(CLazyLinker *self, void *closure) {
899	  return PyBool_FromLong(self->allow_gc);
900	}
901	
902	static int CLazyLinker_set_allow_gc(CLazyLinker *self, PyObject *value,
903	                                    void *closure) {
904	  if (!PyBool_Check(value))
905	    return -1;
906	
907	  if (value == Py_True)
908	    self->allow_gc = true;
909	  else
910	    self->allow_gc = false;
911	  return 0;
912	}
913	
914	static PyGetSetDef CLazyLinker_getset[] = {
915	    {(char *)"allow_gc", (getter)CLazyLinker_get_allow_gc,
916	     (setter)CLazyLinker_set_allow_gc,
917	     (char *)"do this function support allow_gc", NULL},
918	    {NULL, NULL, NULL, NULL} /* Sentinel */
919	};
920	static PyMemberDef CLazyLinker_members[] = {
921	    {(char *)"nodes", T_OBJECT_EX, offsetof(CLazyLinker, nodes), 0,
922	     (char *)"list of nodes"},
923	    {(char *)"thunks", T_OBJECT_EX, offsetof(CLazyLinker, thunks), 0,
924	     (char *)"list of thunks in program"},
925	    {(char *)"call_counts", T_OBJECT_EX, offsetof(CLazyLinker, call_counts), 0,
926	     (char *)"number of calls of each thunk"},
927	    {(char *)"call_times", T_OBJECT_EX, offsetof(CLazyLinker, call_times), 0,
928	     (char *)"total runtime in each thunk"},
929	    {(char *)"position_of_error", T_INT,
930	     offsetof(CLazyLinker, position_of_error), 0,
931	     (char *)"position of failed thunk"},
932	    {(char *)"time_thunks", T_INT, offsetof(CLazyLinker, do_timing), 0,
933	     (char *)"bool: nonzero means call will time thunks"},
934	    {(char *)"need_update_inputs", T_INT,
935	     offsetof(CLazyLinker, need_update_inputs), 0,
936	     (char *)"bool: nonzero means Function.__call__ must implement update "
937	             "mechanism"},
938	    {NULL} /* Sentinel */
939	};
940	
941	static PyTypeObject lazylinker_ext_CLazyLinkerType = {
942	    PyVarObject_HEAD_INIT(NULL, 0)
943	
944	        "lazylinker_ext.CLazyLinker",         /*tp_name*/
945	    sizeof(CLazyLinker),                      /*tp_basicsize*/
946	    0,                                        /*tp_itemsize*/
947	    CLazyLinker_dealloc,                      /*tp_dealloc*/
948	    0,                                        /*tp_print*/
949	    0,                                        /*tp_getattr*/
950	    0,                                        /*tp_setattr*/
951	    0,                                        /*tp_compare*/
952	    0,                                        /*tp_repr*/
953	    0,                                        /*tp_as_number*/
954	    0,                                        /*tp_as_sequence*/
955	    0,                                        /*tp_as_mapping*/
956	    0,                                        /*tp_hash */
957	    CLazyLinker_call,                         /*tp_call*/
958	    0,                                        /*tp_str*/
959	    0,                                        /*tp_getattro*/
960	    0,                                        /*tp_setattro*/
961	    0,                                        /*tp_as_buffer*/
962	    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /*tp_flags*/
963	    "CLazyLinker object",                     /* tp_doc */
964	    0,                                        /* tp_traverse */
965	    0,                                        /* tp_clear */
966	    0,                                        /* tp_richcompare */
967	    0,                                        /* tp_weaklistoffset */
968	    0,                                        /* tp_iter */
969	    0,                                        /* tp_iternext */
970	    0,                          // CLazyLinker_methods,       /* tp_methods */
971	    CLazyLinker_members,        /* tp_members */
972	    CLazyLinker_getset,         /* tp_getset */
973	    0,                          /* tp_base */
974	    0,                          /* tp_dict */
975	    0,                          /* tp_descr_get */
976	    0,                          /* tp_descr_set */
977	    0,                          /* tp_dictoffset */
978	    (initproc)CLazyLinker_init, /* tp_init */
979	    0,                          /* tp_alloc */
980	    CLazyLinker_new,            /* tp_new */
981	};
982	
983	static PyObject *get_version(PyObject *dummy, PyObject *args) {
984	  PyObject *result = PyFloat_FromDouble(0.31);
985	  return result;
986	}
987	
988	static PyMethodDef lazylinker_ext_methods[] = {
989	    {"get_version", get_version, METH_VARARGS, "Get extension version."},
990	    {NULL, NULL, 0, NULL} /* Sentinel */
991	};
992	
993	
994	static struct PyModuleDef moduledef = {PyModuleDef_HEAD_INIT,
995	                                       "lazylinker_ext",
996	                                       NULL,
997	                                       -1,
998	                                       lazylinker_ext_methods,
999	                                       NULL,
1000	                                       NULL,
1001	                                       NULL,
1002	                                       NULL};
1003	
1004	PyMODINIT_FUNC PyInit_lazylinker_ext(void) {
1005	
1006	  PyObject *m;
1007	
1008	  lazylinker_ext_CLazyLinkerType.tp_new = PyType_GenericNew;
1009	  if (PyType_Ready(&lazylinker_ext_CLazyLinkerType) < 0)
1010	    return NULL;
1011	
1012	  m = PyModule_Create(&moduledef);
1013	  Py_INCREF(&lazylinker_ext_CLazyLinkerType);
1014	  PyModule_AddObject(m, "CLazyLinker",
1015	                     (PyObject *)&lazylinker_ext_CLazyLinkerType);
1016	
1017	  return m;
1018	}
1019	
===============================
Problem occurred during compilation with the command line below:
"C:\...\msys64\mingw64\bin\g++.EXE" -shared -g -DNPY_NO_DEPRECATED_API=NPY_1_7_API_VERSION -m64 -DMS_WIN64 -I"C:\...\WPy64-31330\python\Lib\site-packages\numpy\_core\include" -I"C:\...\WPy64-31330\python\include" -I"C:\...\WPy64-31330\envs\MYENV\Lib\site-packages\pytensor\link\c\c_code" -L"C:\...\WPy64-31330\python\libs" -L"C:\...\WPy64-31330\python" -o "C:\Users\USERNAME\AppData\Local\PyTensor\compiledir_Windows-11-10.0.22631-SP0-Intel64_Family_6_Model_186_Stepping_2_GenuineIntel-3.13.3-64\lazylinker_ext\lazylinker_ext.pyd" "C:\Users\USERNAME\AppData\Local\PyTensor\compiledir_Windows-11-10.0.22631-SP0-Intel64_Family_6_Model_186_Stepping_2_GenuineIntel-3.13.3-64\lazylinker_ext\mod.cpp" "C:\...\WPy64-31330\python\python313.dll"

@ricardoV94
Member

The message is not showing why it failed to compile at all :(

@mvds314
Author

mvds314 commented May 30, 2025

I figured it out. As of GCC 15 in MSYS2, libwinpthread-1.dll is expected to export clock_gettime64; older versions of the DLL don't have it.
It turns out another program had put an older copy of this DLL on my PATH, and that copy, which lacks clock_gettime64, was being picked up instead of the one shipped with MSYS2. Making sure the MSYS2 copy is found first on the PATH resolves the problem.
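
For anyone hitting the same thing, here is a minimal sketch of how one might look for shadowing copies of the DLL. It is not part of PyTensor or MSYS2: the helper names find_on_path and exports_symbol are made up for illustration, and the script only walks PATH, which is just one part of the Windows DLL search order, so treat the result as a hint rather than proof.

```python
import ctypes
import os


def find_on_path(filename, path=None):
    """Return every copy of `filename` found on the search path, in lookup order.

    `path` defaults to the PATH environment variable (hypothetical helper).
    """
    if path is None:
        path = os.environ.get("PATH", "")
    hits = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate) and candidate not in hits:
            hits.append(candidate)
    return hits


def exports_symbol(dll_path, symbol):
    """Best-effort check that a shared library exports `symbol`.

    Returns False if the library cannot be loaded at all.
    """
    try:
        lib = ctypes.CDLL(dll_path)
    except OSError:
        return False
    # ctypes raises AttributeError for a missing export, so hasattr is enough.
    return hasattr(lib, symbol)


if __name__ == "__main__":
    # The first line printed is the copy that a PATH lookup would find first.
    for dll in find_on_path("libwinpthread-1.dll"):
        print(dll, exports_symbol(dll, "clock_gettime64"))
```

If the first entry printed is not the MSYS2 copy, or it reports False for clock_gettime64, reordering PATH so that the MSYS2 bin directory comes first is the fix described above.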

@maresb
Contributor

maresb commented May 30, 2025

@ricardoV94
Member

Closing this issue, doesn't seem like there's anything for us to do here?
