
ODIN crashes when generating netlist statistics #2105


Closed
aman26kbm opened this issue Jul 24, 2022 · 2 comments

@aman26kbm
Contributor

aman26kbm commented Jul 24, 2022

Expected Behaviour

ODIN shouldn't crash.

Current Behaviour

ODIN is crashing with the TPU 32x32 design.

I created a debug build of VTR and ran the design again to see what was going on. ODIN hit a stack overflow while generating statistics for the circuit. In the stack trace, the calls alternated between lines 256 and 234 of netlist_statistic.cpp many times.

#87 0x5602b6bd1490 in get_upward_stat /export/aman/vtr_aman/vtr-verilog-to-routing/ODIN_II/SRC/netlist_statistic.cpp:256
#88 0x5602b6bd0ea5 in get_upward_stat /export/aman/vtr_aman/vtr-verilog-to-routing/ODIN_II/SRC/netlist_statistic.cpp:234

These two lines are in the following two overloads of get_upward_stat:
static metric_t* get_upward_stat(nnet_t* net, netlist_t* netlist, uintptr_t traverse_mark_number)
static metric_t* get_upward_stat(nnode_t* node, netlist_t* netlist, uintptr_t traverse_mark_number)

So I thought something might be causing infinite recursion, with these two functions calling each other back and forth endlessly.

I added some print statements to try to identify the node or net in the netlist that causes this. The resulting log doesn't show that pattern: I don't see anything that looks like an infinite loop, and the tool seems to be progressing normally.
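One way to tell a genuine cycle apart from a very deep but finite recursion is to record the recursion depth rather than only printing node names. The following is a minimal sketch, not ODIN code: `DepthGuard` and `walk` are hypothetical names, and `walk` is only a toy stand-in for a recursive traversal such as `get_upward_stat`.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper (not part of ODIN): counts how many guarded calls are
// currently active and remembers the deepest level reached so far.
class DepthGuard {
public:
    DepthGuard() {
        ++current_ref();
        max_ref() = std::max(max_ref(), current_ref());
    }
    ~DepthGuard() { --current_ref(); }

    DepthGuard(const DepthGuard&) = delete;
    DepthGuard& operator=(const DepthGuard&) = delete;

    static std::size_t current() { return current_ref(); }
    static std::size_t max_seen() { return max_ref(); }

private:
    // Function-local statics keep the counters in one place without
    // needing a separate definition in a .cpp file.
    static std::size_t& current_ref() { static std::size_t value = 0; return value; }
    static std::size_t& max_ref() { static std::size_t value = 0; return value; }
};

// Toy stand-in for a recursive traversal: descends `remaining` more levels
// and returns the depth observed at the innermost call.
std::size_t walk(std::size_t remaining) {
    DepthGuard guard;  // one guard per active call
    if (remaining == 0) {
        return DepthGuard::current();
    }
    return walk(remaining - 1);
}
```

Placing one such guard at the top of each recursive function and logging `DepthGuard::max_seen()` periodically would show whether the depth keeps growing in step with the size of the design (deep recursion) or the same nodes keep reappearing (a cycle).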

Then I thought maybe the machine was just low on memory, so I ran this on a larger machine (125 GB RAM). I see the same behavior there as well.
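A note on why more RAM may not help: the call stack is bounded by the process's stack size limit, not by installed memory, so a recursion that is finite but very deep can still overflow it. Whether that is what happens here is not confirmed. As a sketch of the usual workaround, the traversal below keeps its pending work in a heap-allocated vector instead of on the call stack. The `Fanin` layout, `longest_upward_path`, and `make_chain` are hypothetical and much simpler than ODIN's `nnode_t`/`nnet_t` structures, and the netlist is assumed to be acyclic.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Toy netlist (hypothetical layout): fanin[i] lists the nodes that drive node i.
// Assumed to be acyclic.
using Fanin = std::vector<std::vector<std::size_t>>;

// Length, in nodes, of the longest path from any undriven node up to `root`.
// Uses an explicit stack on the heap, so depth is limited by RAM rather than
// by the call-stack size.
std::size_t longest_upward_path(const Fanin& fanin, std::size_t root) {
    std::vector<std::size_t> depth(fanin.size(), 0);       // 0 = not computed yet
    std::vector<std::size_t> next_driver(fanin.size(), 0); // next fanin index to visit
    std::vector<std::size_t> pending;
    pending.push_back(root);

    while (!pending.empty()) {
        const std::size_t node = pending.back();
        if (depth[node] != 0) {            // already finished
            pending.pop_back();
            continue;
        }
        if (next_driver[node] < fanin[node].size()) {
            // Visit the next driver first; come back to `node` afterwards.
            const std::size_t driver = fanin[node][next_driver[node]++];
            if (depth[driver] == 0) {
                pending.push_back(driver);
            }
            continue;
        }
        // All drivers are finished: combine their results.
        std::size_t best = 0;
        for (std::size_t driver : fanin[node]) {
            best = std::max(best, depth[driver]);
        }
        depth[node] = best + 1;
        pending.pop_back();
    }
    return depth[root];
}

// Builds a chain in which node i is driven by node i - 1.
Fanin make_chain(std::size_t length) {
    Fanin fanin(length);
    for (std::size_t i = 1; i < length; ++i) {
        fanin[i].push_back(i - 1);
    }
    return fanin;
}
```

A chain of a few hundred thousand nodes is enough to exhaust a typical default stack with a naive recursive version, while this form only grows the `pending` vector.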

A few other things to mention:

  1. I ran each individual module in the design on its own and each ran without any error. So, something is wrong with the top-level module in the design (where all the submodules are stitched together).

  2. A smaller version of this TPU design (TPU 16x16) passes through the whole flow without any error.

  3. When hard blocks are enabled, the crash is not seen. It occurs only when I run with hard blocks disabled, with either ODIN or ODIN+Yosys.

Possible Solution

Steps to Reproduce

Design file: https://github.com/aman26kbm/vtr-verilog-to-routing/blob/master/vtr_flow/benchmarks/verilog/koios/tpu_like.medium.ws.v

Arch file: https://github.com/aman26kbm/vtr-verilog-to-routing/blob/master/vtr_flow/arch/COFFE_22nm/k6FracN10LB_mem20K_complexDSP_customSB_22nm.xml

Config file:
Failing:
https://github.com/aman26kbm/vtr-verilog-to-routing/blob/master/vtr_flow/tasks/koios/exp1a_odin_soft/agilex.tpu_like.medium.ws/config/config.txt

Passing:
https://github.com/aman26kbm/vtr-verilog-to-routing/blob/master/vtr_flow/tasks/koios/exp1a_odin_hard/agilex.tpu_like.medium.ws/config/config.txt

Context

Your Environment

  • VTR revision used:
  • Operating System and version:
  • Compiler version:
@aman26kbm
Contributor Author

Since this passes with hard blocks enabled, we can reduce the priority of debugging this issue.

@aman26kbm
Contributor Author

Duplicate of #2098

@aman26kbm aman26kbm marked this as a duplicate of #2098 Jul 24, 2022