[APPack] Updated How APPack Adheres to Given Placement #2934

Merged

AlexandreSinger
Contributor

The original implementation of APPack was focused on reconstructing a given flat placement. This can cause issues if the given flat placement disagrees with the decisions of the packer.

Instead, updated APPack so that it treats the flat placement as a hint to help guide how it performs clustering.

Added the following new features:

  • APPack computes the location of clusters based on the centroid of the molecules packed within.
  • APPack attenuates the gain terms of candidates based on their distance from the cluster.
  • APPack drops candidates which are too far from the cluster being created.
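
The three features above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Point` type, the function names, the Manhattan distance metric, the linear attenuation curve, and the `max_dist` threshold are all assumptions made for the example.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical 2D position of a molecule in the flat placement.
struct Point {
    double x;
    double y;
};

// Centroid (mean position) of the molecules already packed into a cluster.
Point cluster_centroid(const std::vector<Point>& packed) {
    assert(!packed.empty());
    Point c{0.0, 0.0};
    for (const Point& p : packed) {
        c.x += p.x;
        c.y += p.y;
    }
    const double n = static_cast<double>(packed.size());
    c.x /= n;
    c.y /= n;
    return c;
}

// Manhattan distance between two positions (assumed metric).
double manhattan_distance(const Point& a, const Point& b) {
    return std::abs(a.x - b.x) + std::abs(a.y - b.y);
}

// A candidate farther than max_dist from the cluster is dropped outright.
bool within_max_distance(double dist, double max_dist) {
    return dist <= max_dist;
}

// Scale a candidate's gain down as it gets farther from the cluster.
// Linear falloff is an assumption; it reaches zero at max_dist.
double attenuate_gain(double gain, double dist, double max_dist) {
    assert(max_dist > 0.0);
    assert(within_max_distance(dist, max_dist));
    return gain * (1.0 - dist / max_dist);
}
```

In this sketch, a packer would recompute the centroid each time a molecule joins the cluster, filter candidates with `within_max_distance`, and rank the remaining ones by their attenuated gain.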

Removed the step that added molecules near the position of the cluster. This had similar effects to unrelated clustering and should be investigated separately later.

With these changes to APPack, the AP flow now improves the wirelength (WL) of circuits by 1-3% at the expense of up to 15% more runtime compared to the default VPR flow.

Results of the AP flow after this change compared to VPR with default options and timing analysis turned off:

9 Largest vtr_chain circuits:

| Circuit | Wirelength Normalized to Baseline | Total Runtime Normalized to Baseline |
|---|---|---|
| arm_core.v | 0.9068062897 | 1.109925293 |
| bgm.v | 0.9382499659 | 0.8754332047 |
| mkDelayWorker32B.v | 0.9958747495 | 0.9211026616 |
| stereovision0.v | 1.013203951 | 1.125222552 |
| stereovision1.v | 0.9929635368 | 1.135148515 |
| stereovision2.v | 0.9711905681 | 1.02623746 |
| LU8PEEng.v | 0.9786887776 | 0.9161814281 |
| LU32PEEng.v | 0.964489841 | 0.9770173737 |
| mcml.v | 0.9673598041 | 1.164516729 |
| GEOMEAN | 0.9698697205 | 1.022610775 |

Titan circuits:

| Circuit | Wirelength Normalized to Baseline | Total Runtime Normalized to Baseline |
|---|---|---|
| LU230_stratixiv_arch_timing.blif | 0.9793564911 | 1.122268882 |
| LU_Network_stratixiv_arch_timing.blif | 0.9921182958 | 1.125048297 |
| SLAM_spheric_stratixiv_arch_timing.blif | 0.9477687294 | 1.24510411 |
| bitcoin_miner_stratixiv_arch_timing.blif | 0.9804838597 | 1.056503177 |
| bitonic_mesh_stratixiv_arch_timing.blif | 0.9753104841 | 1.116877439 |
| cholesky_bdti_stratixiv_arch_timing.blif | 0.9894022306 | 1.178353616 |
| cholesky_mc_stratixiv_arch_timing.blif | 0.9956843444 | 1.236407139 |
| dart_stratixiv_arch_timing.blif | 1.002334218 | 1.147268208 |
| denoise_stratixiv_arch_timing.blif | 0.9847509245 | 1.055015973 |
| des90_stratixiv_arch_timing.blif | 0.99455778 | 1.095323637 |
| directrf_stratixiv_arch_timing.blif | 1.028484778 | 1.143643758 |
| gsm_switch_stratixiv_arch_timing.blif | 0.9718116921 | 1.157820455 |
| mes_noc_stratixiv_arch_timing.blif | 1.011588214 | 1.148153733 |
| minres_stratixiv_arch_timing.blif | 0.973136625 | 1.161735052 |
| neuron_stratixiv_arch_timing.blif | 0.9846669803 | 1.178821502 |
| openCV_stratixiv_arch_timing.blif | 0.9903253185 | 1.119551078 |
| segmentation_stratixiv_arch_timing.blif | 0.9665884718 | 1.125989852 |
| sparcT1_chip2_stratixiv_arch_timing.blif | 0.9754410083 | 1.142591003 |
| sparcT1_core_stratixiv_arch_timing.blif | 0.8786965928 | 1.07788226 |
| sparcT2_core_stratixiv_arch_timing.blif | 0.949142042 | 1.092642298 |
| stap_qrd_stratixiv_arch_timing.blif | 1.070254082 | 1.152198384 |
| stereo_vision_stratixiv_arch_timing.blif | 0.9459168542 | 1.1506696 |
| GEOMEAN | 0.9812645462 | 1.137721339 |

Koios circuits:

| Circuit | Wirelength Normalized to Baseline | Total Runtime Normalized to Baseline |
|---|---|---|
| attention_layer.v | 0.9651406788 | 0.9976654765 |
| bnn.v | 0.9698804243 | 1.00138961 |
| bwave_like.fixed.large.v | 0.9716349158 | 0.9717420873 |
| bwave_like.fixed.small.v | 1.0348547 | 0.8151621286 |
| bwave_like.float.large.v | 0.9450905059 | 1.025431804 |
| bwave_like.float.small.v | 1.003100325 | 1.002006421 |
| clstm_like.large.v | 0.9857700805 | 1.229624392 |
| clstm_like.medium.v | 1.041169108 | 1.093515828 |
| clstm_like.small.v | 1.055022773 | 1.125005975 |
| conv_layer.v | 1.01150055 | 1.072230911 |
| conv_layer_hls.v | 0.9155518973 | 0.9496418186 |
| dla_like.large.v | 0.9728111019 | 1.107611749 |
| dla_like.medium.v | 1.009068249 | 1.066332359 |
| dla_like.small.v | 0.993846884 | 0.9974674935 |
| dnnweaver.v | 1.010385618 | 1.010849811 |
| eltwise_layer.v | 1.02550042 | 1.027436459 |
| gemm_layer.v | 0.9858456208 | 1.000940291 |
| lenet.v | 0.9886877377 | 1.08042655 |
| lstm.v | 0.9655769287 | 0.8813570327 |
| reduction_layer.v | 1.074968315 | 1.082794706 |
| robot_rl.v | 0.9951012262 | 1.077313228 |
| softmax.v | 1.011888208 | 0.939142462 |
| spmv.v | 1.005323373 | 1.01993865 |
| tdarknet_like.large.v | 1.028320352 | 1.00433856 |
| tdarknet_like.small.v | 1.091996725 | 2.017599895 |
| tpu_like.large.os.v | 0.982005703 | 0.9786696848 |
| tpu_like.large.ws.v | 0.987011726 | 0.9513147002 |
| tpu_like.small.os.v | 1.019433773 | 1.084353741 |
| tpu_like.small.ws.v | 0.907651905 | 0.9695574454 |
| GEOMEAN | 0.9984186146 | 1.054512458 |

These results are surprisingly good given that the partial legalizer does not take into account block types.
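
For reference, a geometric mean over normalized ratios is conventionally computed in log space, as in the sketch below. The function name is hypothetical and this is not necessarily how the GEOMEAN rows above were produced; it only illustrates the standard definition.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Geometric mean of strictly positive ratios (e.g. AP flow / baseline).
// Summing logarithms avoids overflow or underflow of a running product.
double geometric_mean(const std::vector<double>& ratios) {
    assert(!ratios.empty());
    double log_sum = 0.0;
    for (double r : ratios) {
        assert(r > 0.0);
        log_sum += std::log(r);
    }
    return std::exp(log_sum / static_cast<double>(ratios.size()));
}
```

A ratio below 1.0 means the AP flow did better than the baseline on that metric, so a geometric mean below 1.0 indicates an overall improvement across the suite.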

@AlexandreSinger
Contributor Author

@amin1377 FYI

@github-actions github-actions bot added the VPR (VPR FPGA Placement & Routing Tool) and lang-cpp (C/C++ code) labels Mar 15, 2025
Contributor

@vaughnbetz vaughnbetz left a comment

A few comments, but LGTM -- merge whenever you like.

@AlexandreSinger AlexandreSinger merged commit ab25381 into verilog-to-routing:master Mar 19, 2025
36 checks passed
@AlexandreSinger AlexandreSinger deleted the feature-appack branch March 19, 2025 12:23