
[AP][Timing] Added Basic Net Weighting #2969


Merged


AlexandreSinger
Contributor

Added basic timing awareness to the AP flow by weighting nets in the AP solver by their criticality (the max criticality of all edges through that net). This makes the solver prioritize shortening more-critical nets over less-critical ones, where criticality comes from the pre-clustering timing analyzer.

Added a command-line option to trade off between timing and wirelength in the AP flow.
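A minimal sketch of the weighting described above, not the code from this PR. The function names `net_criticality` and `net_weight` are hypothetical, and the linear blend between criticality and a uniform weight is an assumption; the PR description does not give the formula. The blend was chosen only because it matches the two end points discussed in this thread: a tradeoff of 0.0 gives every net the same weight (no net weighting), and a tradeoff of 1.0 gives zero-criticality nets a weight of 0.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: a net's criticality is the maximum criticality of
// all timing edges that pass through that net. Criticalities are in [0, 1].
inline float net_criticality(const std::vector<float>& edge_criticalities) {
    float crit = 0.0f;
    for (float c : edge_criticalities)
        crit = std::max(crit, c);
    return crit;
}

// ASSUMPTION: a linear blend between pure wirelength (weight 1 for every
// net) and pure timing (weight equal to the net's criticality).
//   tradeoff = 0.0 -> weight is 1.0 for all nets (no net weighting)
//   tradeoff = 1.0 -> weight equals criticality (crit 0 nets are ignored)
inline float net_weight(float crit, float tradeoff) {
    assert(crit >= 0.0f && crit <= 1.0f);
    assert(tradeoff >= 0.0f && tradeoff <= 1.0f);
    return tradeoff * crit + (1.0f - tradeoff);
}
```

Under this assumed blend, a net with edge criticalities {0.2, 0.9, 0.5} has criticality 0.9, and at a tradeoff of 0.5 its weight would be 0.95, compared with 0.5 for a net with criticality 0.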

@github-actions github-actions bot added VPR VPR FPGA Placement & Routing Tool docs Documentation lang-cpp C/C++ code labels Apr 10, 2025
@AlexandreSinger
Contributor Author

@vaughnbetz @amin1377 Good news regarding the timing-aware AP flow.

Results on Titan (fixed IOs, 1.3 * MinW):

| Flow | CPD | WL |
|---|---|---|
| No AP | 1.000 | 1.000 |
| AP tradeoff=0.0 | 0.994 | 0.952 |
| AP tradeoff=0.5 | 0.963 | 0.950 |
| AP tradeoff=1.0 | 0.999 | 0.937 |

Note: 1 circuit (sparcT1_chip2) failed to pack for the AP flow at tradeoff=0.0 and 0.5 due to a known issue with the IO blocks. Not all IO blocks were fixed, and the max candidate distance may have been set too low for IO blocks. For tradeoff=1.0, gsm_switch failed to route.

AP tradeoff=0.0 is the AP flow with no net weighting (it is what the AP flow would produce before this PR). With this new net weighting scheme (with a sensible tradeoff), we can see a 3.7% improvement in CPD and a 5% improvement in WL on Titan relative to non-AP. Run time is still higher than non-AP; this change did not improve it.

Increasing the tradeoff to 1.0 loses most of the CPD gain. This is likely because the objective of the global placer then no longer aligns with that of the full legalizer (0.75 tradeoff by default) or the detailed placer (0.5 tradeoff by default).

Per-circuit results for AP tradeoff=0.5, normalized to non-AP:

| Circuit | CPD | WL | Run time |
|---|---|---|---|
| LU230_stratixiv_arch_timing.blif | 0.979 | 0.947 | 1.189 |
| LU_Network_stratixiv_arch_timing.blif | 0.904 | 0.974 | 1.367 |
| SLAM_spheric_stratixiv_arch_timing.blif | 0.961 | 0.885 | 1.349 |
| bitcoin_miner_stratixiv_arch_timing.blif | 0.839 | 0.977 | 1.126 |
| bitonic_mesh_stratixiv_arch_timing.blif | 0.989 | 1.008 | 1.391 |
| cholesky_bdti_stratixiv_arch_timing.blif | 0.984 | 0.945 | 1.370 |
| cholesky_mc_stratixiv_arch_timing.blif | 0.928 | 0.948 | 1.351 |
| dart_stratixiv_arch_timing.blif | 1.086 | 0.915 | 1.438 |
| denoise_stratixiv_arch_timing.blif | 0.974 | 0.983 | 1.285 |
| des90_stratixiv_arch_timing.blif | 0.964 | 0.929 | 1.316 |
| directrf_stratixiv_arch_timing.blif | 0.941 | 0.888 | 1.120 |
| gsm_switch_stratixiv_arch_timing.blif | 1.008 | 1.053 | 1.393 |
| mes_noc_stratixiv_arch_timing.blif | 0.789 | 0.953 | 1.329 |
| minres_stratixiv_arch_timing.blif | 1.001 | 0.961 | 1.421 |
| neuron_stratixiv_arch_timing.blif | 0.998 | 1.005 | 1.730 |
| openCV_stratixiv_arch_timing.blif | 1.118 | 0.978 | 1.435 |
| segmentation_stratixiv_arch_timing.blif | 0.998 | 0.967 | 1.445 |
| sparcT1_chip2_stratixiv_arch_timing.blif | Pack Failure | Pack Failure | Pack Failure |
| sparcT1_core_stratixiv_arch_timing.blif | 0.898 | 0.876 | 1.513 |
| sparcT2_core_stratixiv_arch_timing.blif | 0.990 | 0.869 | 1.329 |
| stap_qrd_stratixiv_arch_timing.blif | 0.940 | 0.932 | 1.409 |
| stereo_vision_stratixiv_arch_timing.blif | 0.995 | 0.977 | 1.764 |
| Geomean | 0.963 | 0.950 | 1.376 |
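For readers who want to reproduce the Geomean row, here is a small sketch of a geometric mean over normalized ratios. The helper name `geomean` is hypothetical and this is not the script used to produce the table. It assumes the circuit that failed to pack (sparcT1_chip2) is simply left out of the input, which appears to be how the row above was computed.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Geometric mean of strictly positive ratios, computed in log space so the
// running product cannot overflow or underflow for long lists.
inline double geomean(const std::vector<double>& ratios) {
    assert(!ratios.empty());
    double log_sum = 0.0;
    for (double r : ratios) {
        assert(r > 0.0);  // failed runs must be filtered out by the caller
        log_sum += std::log(r);
    }
    return std::exp(log_sum / static_cast<double>(ratios.size()));
}
```

A geometric mean suits ratios normalized to a baseline because a 2x slowdown on one circuit and a 2x speedup on another cancel to 1.0, which an arithmetic mean would not do.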

As you can see, CPD went up on some circuits, by as much as 12% (openCV), while WL remained mostly below 1.00. One circuit (gsm_switch) was worse in both metrics: slightly worse in CPD and about 5% worse in WL.

@vaughnbetz
Contributor

Excellent! Interesting result at timing trade off of 1.

@AlexandreSinger
Contributor Author

> Excellent! Interesting result at timing trade off of 1.

Yeah, I agree. I was not expecting the WL to improve while the CPD got worse (yet still better than non-AP). Perhaps the AP flow completely ignoring certain non-critical nets makes the more critical nets shorter, but this then causes the originally non-critical nets to become critical. This can probably be fixed by improving the accuracy of the pre-packed timing analysis using AP info.

Contributor

@vaughnbetz vaughnbetz left a comment

LGTM

@vaughnbetz vaughnbetz merged commit d6fe4b5 into verilog-to-routing:master Apr 11, 2025
36 checks passed
@amin1377
Contributor

> Excellent! Interesting result at timing trade off of 1.

Not sure if it’s relevant, but for the default placement flow, on a set of benchmarks I tested a while ago, setting the timing trade-off to 1 actually resulted in worse CPD, while wirelength improved slightly.
