Skip to content

timing_builder: allow dangling combinational nodes #643

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

acomodi
Copy link
Collaborator

@acomodi acomodi commented Jun 10, 2019

Signed-off-by: Alessandro Comodi [email protected]

Description

In the SymbiFlow architectures, some of the tiles do not have combination sinks. This fix is needed to allow the architecture to have no combination sinks for specific tiles. Without this addition, the Timing graph builder would produce a tatum error and the failure of the VPR flow.

It may be better to have this fix under a command line option (e.g. --allow_non_combination_sinks).

Motivation and Context

Avoid Tatum error while building the Timing Graph within VPR.

Types of changes

  • Bug fix (change which fixes an issue)
  • New feature (change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

  • My change requires a change to the documentation
  • I have updated the documentation accordingly
  • I have added tests to cover my changes
  • All new and existing tests passed

@probot-autolabeler probot-autolabeler bot added lang-cpp C/C++ code VPR VPR FPGA Placement & Routing Tool labels Jun 10, 2019
@acomodi
Copy link
Collaborator Author

acomodi commented Jun 12, 2019

@kmurray, @vaughnbetz FYI

@kmurray
Copy link
Contributor

kmurray commented Jun 12, 2019

@acomodi Can you give more context for why this is needed? Usually dangling combinational nodes means something is odd in the netlist, which is why it gets flagged by default.

I'm hesitant to always suppress it, since it often indicates real issues which would otherwise go undetected.

@acomodi
Copy link
Collaborator Author

acomodi commented Jun 12, 2019

@kmurray Sure, I will paste a full log of the error and additional debug files produced when dangling combinational nodes are not allowed.

Probably @litghost can provide further context on this issue.

@acomodi
Copy link
Collaborator Author

acomodi commented Jun 12, 2019

@kmurray if it helps here there are logs of a failing run when dangling combinational nodes are not allowed: logs.zip

@vaughnbetz
Copy link
Contributor

vaughnbetz commented Jun 13, 2019

@acomodi a dangling combinational node should indicate that the netlist has combinational logic that doesn't go anywhere, rather than a tile (physical block) that has no combinational sink.
I looked at the vpr_stdout.log file attached and it ends on these lines:
Build Timing Graph
Build Timing Graph took 0.00 seconds (max_rss 14.1 MiB, delta_rss +0.0 MiB)
Kevin, does dangling combinational logic print an error message (maybe it went to stderr and wasn't captured)?

@kmurray
Copy link
Contributor

kmurray commented Jun 14, 2019

@vaughnbetz I believe the timing graph throws an exception during its consistency check, but the message doesn't appear in the log posted

@acomodi The following warnings look suspicious:

Warning 1: Model 'CE_VCC' output port 'VCC' has no timing specification (no clock specified to create a sequential output port, not combinationally connected to any inputs, not a clock output)
Warning 2: Model 'SR_GND' output port 'GND' has no timing specification (no clock specified to create a sequential output port, not combinationally connected to any inputs, not a clock output)
Warning 3: Model 'NO_FF' input port 'D' has no timing specification (no clock specified to create a sequential input port, not combinationally connected to any outputs, not a clock input)
Warning 4: Model 'NO_DRAM' input port 'A' has no timing specification (no clock specified to create a sequential input port, not combinationally connected to any outputs, not a clock input)
Warning 5: Model 'RAMB18E1_VPR' input port 'ADDRBTIEHIGH' has no timing specification (no clock specified to create a sequential input port, not combinationally connected to any outputs, not a clock input)
Warning 6: Model 'RAMB18E1_VPR' input port 'ADDRATIEHIGH' has no timing specification (no clock specified to create a sequential input port, not combinationally connected to any outputs, not a clock input)
Warning 7: Model 'RAMB36E1_PRIM' input port 'CASCADEINB' has no timing specification (no clock specified to create a sequential input port, not combinationally connected to any outputs, not a clock input)
Warning 8: Model 'RAMB36E1_PRIM' input port 'CASCADEINA' has no timing specification (no clock specified to create a sequential input port, not combinationally connected to any outputs, not a clock input)
Warning 9: Model 'RAMB36E1_PRIM' output port 'CASCADEOUTB' has no timing specification (no clock specified to create a sequential output port, not combinationally connected to any inputs, not a clock output)
Warning 10: Model 'RAMB36E1_PRIM' output port 'CASCADEOUTA' has no timing specification (no clock specified to create a sequential output port, not combinationally connected to any inputs, not a clock output)
Warning 11: Model 'VCC' output port 'VCC' has no timing specification (no clock specified to create a sequential output port, not combinationally connected to any inputs, not a clock output)
Warning 12: Model 'GND' output port 'GND' has no timing specification (no clock specified to create a sequential output port, not combinationally connected to any inputs, not a clock output)

They indicate that there are ports with no timing specifications which means the timing graph builder does not know what to do with them (no edges to add, not marked as sequential ports). As a result the timing graph consistency checks will fail.

I suspect fixing the architecture is a better approach than disabling the timing graph consistency checks universally, since they are aimed at catching these kinds of issues.

@vaughnbetz
Copy link
Contributor

@kmurray Many of these definitely look suspicious and I agree they should be fixed.
Some look like vcc and gnd tie offs. What is your recommended way for specifying their timing? They don't really have a time dependency (they're stable from t = -infinity). A 0 delay edge is not too far off correct (a bit conservative) and a -ve delay edge is probably the most correct if it is allowed (that's what the old vpr classic timing analyzer did for constant generators).

@kmurray
Copy link
Contributor

kmurray commented Jun 14, 2019

Some look like vcc and gnd tie offs. What is your recommended way for specifying their timing?

Historically we've only ever had 'regular' primitives act as constant generators -- that is blocks which otherwise have a 'well formed' timing model (edges/sources etc.) which can be inferred from the section of the architecture.

Whether any pins were constant (e.g. a block with no inputs) was handled as part of the timing constraints. VPR has existing code which detects constant generators from the netlist, and marks them as such in the timing constraints. Any timing nodes marked as constant generators are treated as sources launching data at t = -infinity.

For dedicated constant generators (i.e. tie-off primitives) we could extend the section to allow tagging the ports as such, and update the timing graph builder to just create SOURCE tnodes in those cases. That should ensure the timing graph produced passes all the legality checks within tatum. The existing constant generator inference code should handle tagging those pins as constants in the timing constraints.

However I wonder if it would be possible to model the tie-offs differently which would bypass this issue.
@acomodi Why are the dedicated tie-off primitives needed? Compared, for instance, to using regular constant-generator 0-LUTs.

@litghost
Copy link
Collaborator

litghost commented Jun 16, 2019

@acomodi Why are the dedicated tie-off primitives needed? Compared, for instance, to using regular constant-generator 0-LUTs.

Because the fabric provides constant sources in many places in the graph (including within a site). For example, all LUT pins default to the VCC source by default (and consuming no routing resources). Every INT switch box has it's own dedicate GND source. Internal to sites are tieoff muxes, like the SR_GND and CE_VCC sources. In the case of the DSP block, there a special tieoff block dedicated for the DSP IPINs.

As for the constant-generator 0-LUTs, that is a pretty poor constant solutions, as that LUT gets routed across the entire fabric, consuming routing resources as it goes.

@kmurray
Copy link
Contributor

kmurray commented Jun 18, 2019

After further reflection, I think the long term fix is to properly model tie-offs in the architecture (i.e. #163), and update the packer/router to be aware of them. That would remove the need to create special constant-generator primitives in the netlist as you do now (which is really just a work around for not having #163).

In the short term, making this a new command-line option (i.e. --allow_dangling_combinational_nodes, which defaults to off) likely makes sense. It would maintain the sanity checks by default but would allow the end-user to disable it if they have good reason to.

@litghost
Copy link
Collaborator

litghost commented Jun 18, 2019

After further reflection, I think the long term fix is to properly model tie-offs in the architecture (i.e. #163), and update the packer/router to be aware of them. That would remove the need to create special constant-generator primitives in the netlist as you do now (which is really just a work around for not having #163).

In the XC7 architecture, the tie-offs generally speaking need to be routed. When I implemented the tie-offs as I did, I considered whether how a more specialize approach to #163 would go. In my head, I was unable to justify the complexity of adding explicit modelling support for tie-offs, versus just treating constant nets as nets.

Can you expand on how you would implement #163 that is different from adding blackbox's that are constant sources?

@kmurray
Copy link
Contributor

kmurray commented Jun 18, 2019

Can you expand on how you would implement #163 that is different from adding blackbox's that are constant sources?

In the netlist I would have a 0-LUT for gnd and a 0-LUT for vcc. All primitive pins which are driven by constants would connect to one of those nets. This describes precisely and consistently the logical behaviour desired. Allowing stages which only care about the logical behaviour (e.g. timing graph) to be correct by construction.

With the tie-off modelling of #163 how that logical behaviour is actually implemented is architecture dependent optimization choice.

Pin tie-offs (which are really just additional inputs to the pin's input mux) would be modelled in the RR graph (e.g. intra-cluster RR graph) as additional SOURCE->IPIN connections. If a pin was tie-able to VCC/GND the pin would be connected to the VCC/GND SOURCE node. The standard router can then be invoked with only the VCC/GND source initialized.

The two implementation extremes are:

  1. The architecture supports no tie-offs. This requires the outputs of the 0-LUTs to be routed just like regular signals to all constant primitive pins.
  2. The architecture supports tie-offs at every pin. In this case it's always a single hop from the VCC/GND source to the pin being tied off.

Of course, most architecture would be somewhere in between with some tie-able pins and others which must have constants routed to them. This modelling approach handles the extreme and intermediate cases uniformly in a data-driven way.

@litghost
Copy link
Collaborator

litghost commented Jun 18, 2019

In the netlist I would have a 0-LUT for gnd and a 0-LUT for vcc. All primitive pins which are driven by constants would connect to one of those nets. This describes precisely and consistently the logical behaviour desired. Allowing stages which only care about the logical behaviour (e.g. timing graph) to be correct by construction.

I agree that this is logically correct. However doing this requires VPR explicitly modelling that net as something other than a LUT.

Pin tie-offs (which are really just additional inputs to the pin's input mux) would be modelled in the RR graph (e.g. intra-cluster RR graph) as additional SOURCE->IPIN connections. If a pin was tie-able to VCC/GND the pin would be connected to the VCC/GND SOURCE node. The standard router can then be invoked with only the VCC/GND source initialized.

Almost none of the 7-series tie-offs look like this. There are 4 "https://github.com/verilog-to-routing/vtr-verilog-to-routing/issues/654types" of tie-offs seen:

  • Routable tieoffs that a part of the local interconnect switch/connection box. Using this source requires first class routing. This cannot be approximated by an "IPIN" tieoff, because of how the switch box fans out the constant signal.
  • Site internal tieoffs that are usually muxed with a routed signal internally to the site. These are not particular amendable to the IPIN tieoff model, because of the site routing mux. There is also additional complexity present here, because the packer doesn't have awareness (today) that a net is sourced from a constant source. Consider the following diagram taken from the SLICEM:
    slL1T3z3yVPgBxXdBTN46Cg (2)

I was able to express this construct (this is the CE_VCC blackbox) using the existing architecture.

@acomodi acomodi force-pushed the timing-graph-builder-fix branch from ee8f42a to 619dbca Compare June 19, 2019 10:57
@acomodi
Copy link
Collaborator Author

acomodi commented Jun 19, 2019

@kmurray @litghost For the short term solution I have added the option to allow dangling combinational nodes.

@acomodi acomodi force-pushed the timing-graph-builder-fix branch 2 times, most recently from 4b19a7e to 5f33dda Compare June 20, 2019 21:10
@acomodi
Copy link
Collaborator Author

acomodi commented Jun 24, 2019

@kmurray @vaughnbetz ping

@acomodi acomodi force-pushed the timing-graph-builder-fix branch from 5f33dda to 37fc051 Compare June 24, 2019 15:14
@acomodi acomodi changed the title timing_builder: fixes a tatum error with some symbiflow tests timing_builder: allow dangling combinational nodes Jun 24, 2019
@acomodi acomodi force-pushed the timing-graph-builder-fix branch from 37fc051 to 803d75a Compare July 1, 2019 11:41
@acomodi
Copy link
Collaborator Author

acomodi commented Jul 1, 2019

@kmurray @vaughnbetz Ping

@acomodi acomodi force-pushed the timing-graph-builder-fix branch from 803d75a to 4b57277 Compare July 2, 2019 11:00
@acomodi acomodi force-pushed the timing-graph-builder-fix branch from 4b57277 to 9511de5 Compare July 11, 2019 09:35
@acomodi
Copy link
Collaborator Author

acomodi commented Jul 11, 2019

@kmurray ping. I have rebased with the most recent updates on VTR

@acomodi acomodi force-pushed the timing-graph-builder-fix branch from 9511de5 to e3d8b21 Compare July 19, 2019 13:07
@kmurray kmurray merged commit 1881236 into verilog-to-routing:master Jul 22, 2019
@kmurray
Copy link
Contributor

kmurray commented Jul 22, 2019

Thanks!

@acomodi acomodi deleted the timing-graph-builder-fix branch July 23, 2019 10:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lang-cpp C/C++ code VPR VPR FPGA Placement & Routing Tool
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants