[AP][GlobalPlacement] Improved Partial Legalizer Legality #2942

@AlexandreSinger AlexandreSinger commented Mar 20, 2025

Updated the partial legalizer to now take into account block types when spreading blocks.

This creates windows around overfilled bins that are aware of which block types are overfilled and how large the window needs to be to accommodate them. Block types are also taken into account when spreading, so blocks can only spread into sub-windows that they can legally exist in.
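To make the window-sizing idea concrete, here is a minimal sketch, not the code in this PR. It assumes a simple 2D grid of bins with per-block-type capacity and utilization, and grows a square window around an overfilled bin until the window has enough capacity for every block type that is overfilled in that bin. The names `Bin`, `Window`, and `grow_window` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bin: capacity and current utilization, indexed by block type.
struct Bin {
    std::vector<int> capacity;
    std::vector<int> utilization;
};

// Inclusive window bounds in bin coordinates.
struct Window {
    int x_min, y_min, x_max, y_max;
};

// Grow a square window around bin (cx, cy) until, for every block type that
// is overfilled in that bin, the window's total capacity of that type covers
// the window's total utilization of that type. Stops at the full grid.
Window grow_window(const std::vector<std::vector<Bin>>& grid, int cx, int cy) {
    const int h = static_cast<int>(grid.size());
    assert(h > 0);
    const int w = static_cast<int>(grid[0].size());
    assert(cx >= 0 && cx < w && cy >= 0 && cy < h);

    const Bin& center = grid[cy][cx];
    const std::size_t num_types = center.capacity.size();

    for (int r = 0;; ++r) {
        Window win{std::max(0, cx - r), std::max(0, cy - r),
                   std::min(w - 1, cx + r), std::min(h - 1, cy + r)};
        bool fits = true;
        for (std::size_t t = 0; t < num_types && fits; ++t) {
            // Only the block types that are overfilled drive the window size.
            if (center.utilization[t] <= center.capacity[t]) continue;
            long cap = 0, util = 0;
            for (int y = win.y_min; y <= win.y_max; ++y) {
                for (int x = win.x_min; x <= win.x_max; ++x) {
                    cap += grid[y][x].capacity[t];
                    util += grid[y][x].utilization[t];
                }
            }
            if (util > cap) fits = false;
        }
        const bool covers_grid = win.x_min == 0 && win.y_min == 0 &&
                                 win.x_max == w - 1 && win.y_max == h - 1;
        if (fits || covers_grid) return win;
    }
}
```

In this sketch, a bin overfilled with a rare block type (for example, one whose only capacity sits several bins away) gets a larger window than a bin overfilled with a common type, which is the behaviour described above.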

This improves quality but hurt runtime, so some performance improvements were needed.

To improve the performance of the partial legalizer, I split the problem into groups of models which must be spread together. This allows us to create tighter windows and can make some parts of the legalizer more efficient. I created a model grouper class which forms the model pack patterns into a graph and finds the disconnected sub-graphs, which become the model groups.
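The grouping step amounts to finding connected components. Below is a minimal sketch under stated assumptions, not the model grouper class from this PR: models are plain integer IDs, each pack pattern is a list of the model IDs it contains, and union-find stands in for an explicit graph traversal. The function name `group_models` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns a dense group ID for every model. Two models share a group if they
// are linked, directly or transitively, by appearing in the same pack pattern.
std::vector<int> group_models(std::size_t num_models,
                              const std::vector<std::vector<int>>& pack_patterns) {
    std::vector<int> parent(num_models);
    std::iota(parent.begin(), parent.end(), 0);

    // Union-find "find" with path halving.
    auto find = [&](int m) {
        while (parent[m] != m) {
            parent[m] = parent[parent[m]];
            m = parent[m];
        }
        return m;
    };

    // Every model in a pattern is merged with the pattern's first model.
    for (const auto& pattern : pack_patterns) {
        for (std::size_t i = 0; i < pattern.size(); ++i) {
            assert(pattern[i] >= 0 &&
                   static_cast<std::size_t>(pattern[i]) < num_models);
            const int a = find(pattern[0]);
            const int b = find(pattern[i]);
            if (a != b) parent[b] = a;
        }
    }

    // Relabel component roots as 0, 1, 2, ... in order of first appearance.
    std::vector<int> group_of_root(num_models, -1);
    std::vector<int> group(num_models);
    int next_group = 0;
    for (std::size_t m = 0; m < num_models; ++m) {
        const int root = find(static_cast<int>(m));
        if (group_of_root[root] < 0) group_of_root[root] = next_group++;
        group[m] = group_of_root[root];
    }
    return group;
}
```

For example, with models 0 to 4 and patterns `{0, 1}`, `{1, 2}`, and `{3}`, models 0, 1, and 2 land in one group, while models 3 and 4 each form their own group and can be spread independently.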

I also improved window generation by pre-clustering the overfilled bins before creating the windows. This sped up the window generation code since fewer windows overlap.
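One way to picture the pre-clustering step is sketched below. The description does not say how bins are clustered, so this is an assumption: adjacent overfilled bins on a 2D grid are flood-filled into one cluster, and each cluster yields a single bounding box to seed one window, rather than one window per bin. `Box` and `cluster_overfilled` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Inclusive bounding box of one cluster, in bin coordinates.
struct Box {
    int x_min, y_min, x_max, y_max;
};

// Groups 4-connected overfilled bins into clusters and returns one bounding
// box per cluster, in row-major order of each cluster's first bin.
std::vector<Box> cluster_overfilled(const std::vector<std::vector<bool>>& overfilled) {
    const int h = static_cast<int>(overfilled.size());
    const int w = h > 0 ? static_cast<int>(overfilled[0].size()) : 0;
    std::vector<std::vector<bool>> seen(h, std::vector<bool>(w, false));
    std::vector<Box> boxes;
    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};

    for (int y = 0; y < h; ++y) {
        assert(static_cast<int>(overfilled[y].size()) == w);
        for (int x = 0; x < w; ++x) {
            if (!overfilled[y][x] || seen[y][x]) continue;

            // Flood fill from this bin, tracking the cluster's extent.
            Box box{x, y, x, y};
            std::vector<std::pair<int, int>> stack;
            stack.push_back({x, y});
            seen[y][x] = true;
            while (!stack.empty()) {
                const std::pair<int, int> cur = stack.back();
                stack.pop_back();
                box.x_min = std::min(box.x_min, cur.first);
                box.x_max = std::max(box.x_max, cur.first);
                box.y_min = std::min(box.y_min, cur.second);
                box.y_max = std::max(box.y_max, cur.second);
                for (int k = 0; k < 4; ++k) {
                    const int nx = cur.first + dx[k];
                    const int ny = cur.second + dy[k];
                    if (nx < 0 || nx >= w || ny < 0 || ny >= h) continue;
                    if (!overfilled[ny][nx] || seen[ny][nx]) continue;
                    seen[ny][nx] = true;
                    stack.push_back({nx, ny});
                }
            }
            boxes.push_back(box);
        }
    }
    return boxes;
}
```

Without clustering, three touching overfilled bins would each seed their own window and those windows would largely overlap; with it, they start from one shared box.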

Because the Global Placer now generates a better flat placement, I also tuned the APPack parameters to account for this.

vtr_chain largest 9 circuits:

| Circuit | Wirelength | Total Runtime | Num CLB |
| --- | --- | --- | --- |
| arm_core.v | 0.9093321006 | 1.081643543 | 0.9941656943 |
| bgm.v | 0.9360721953 | 0.9693323743 | 1.004008746 |
| mkDelayWorker32B.v | 1.000589321 | 0.9923954373 | 1.006329114 |
| stereovision0.v | 1.03556705 | 1.230267062 | 1.004261364 |
| stereovision1.v | 0.9325347712 | 1.247524752 | 0.9960578187 |
| stereovision2.v | 0.9862726051 | 1.244405247 | 1.002373887 |
| LU8PEEng.v | 0.9510925735 | 1.088185719 | 0.9892873777 |
| LU32PEEng.v | 0.9960508026 | 1.066690184 | 0.9923704993 |
| mcml.v | 0.9394928179 | 1.239699741 | 1.000296824 |
| Geomean | 0.964452035 | 1.123831231 | 0.9987784724 |

Titan:

| Circuit | WL | Total Runtime |
| --- | --- | --- |
| LU230_stratixiv_arch_timing.blif | 1.027561983 | 1.418597083 |
| LU_Network_stratixiv_arch_timing.blif | 0.9858089215 | 1.673609251 |
| SLAM_spheric_stratixiv_arch_timing.blif | 0.8756827102 | 1.508746807 |
| bitcoin_miner_stratixiv_arch_timing.blif | 0.9558732952 | 1.162009002 |
| bitonic_mesh_stratixiv_arch_timing.blif | 0.9426635 | 1.348370419 |
| cholesky_bdti_stratixiv_arch_timing.blif | 0.9452045991 | 1.554511681 |
| cholesky_mc_stratixiv_arch_timing.blif | 1.001867478 | 1.723723724 |
| dart_stratixiv_arch_timing.blif | 0.9537204738 | 1.362852268 |
| denoise_stratixiv_arch_timing.blif | 1.01567563 | 1.321576686 |
| des90_stratixiv_arch_timing.blif | 0.9443450449 | 1.312905195 |
| directrf_stratixiv_arch_timing.blif | 0.9848683456 | 1.347394076 |
| gsm_switch_stratixiv_arch_timing.blif | 0.9280838909 | 1.409083922 |
| mes_noc_stratixiv_arch_timing.blif | 0.9975245052 | 1.270281598 |
| minres_stratixiv_arch_timing.blif | 0.9347685849 | 1.64632153 |
| neuron_stratixiv_arch_timing.blif | 0.988794427 | 1.913641423 |
| openCV_stratixiv_arch_timing.blif | 1.002306883 | 1.497142652 |
| segmentation_stratixiv_arch_timing.blif | 0.9849441287 | 1.409029364 |
| sparcT1_chip2_stratixiv_arch_timing.blif | 0.9266030893 | 1.307905593 |
| sparcT1_core_stratixiv_arch_timing.blif | 0.8379489165 | 1.23623315 |
| sparcT2_core_stratixiv_arch_timing.blif | 0.9441922526 | 1.279859778 |
| stap_qrd_stratixiv_arch_timing.blif | 1.010116535 | 1.460021851 |
| stereo_vision_stratixiv_arch_timing.blif | 0.9735828011 | 1.632588476 |
| Geomean | 0.9608451674 | 1.434730844 |

Largest Koios designs:

| Circuit | WL | Runtime | Num CLB |
| --- | --- | --- | --- |
| attention_layer.v | 0.7228235825 | 1.036979637 | 1.012182741 |
| lstm.v | 0.9683019709 | 0.7518383248 | 1.00862482 |
| dla_like.medium.v | 1.021933838 | 1.213541385 | 0.992047064 |
| clstm_like.medium.v | 1.036250207 | 1.737207524 | 1.000784929 |
| bwave_like.fixed.large.v | 0.9891537953 | 1.028456589 | 0.9838235294 |
| dnnweaver.v | 1.030718759 | 1.061967637 | 1.00232057 |
| clstm_like.large.v | 0.9978387277 | 1.752006315 | 1.000306466 |
| bwave_like.float.large.v | 0.9522669174 | 1.133468927 | 0.9994537012 |
| tpu_like.large.ws.v | 0.9428970318 | 1.030098159 | 1.00483871 |
| tpu_like.large.os.v | 0.9631201262 | 1.061466133 | 1.001309758 |
| dla_like.large.v | 1.00410154 | 1.313844874 | 0.9880947703 |
| tdarknet_like.large.v | 1.04231123 | 1.131325975 | 0.9984693878 |
| tdarknet_like.small.v | 1.056823884 | 2.103966105 | 1.001973847 |
| Geomean | 0.9751754703 | 1.212099783 | 0.9995282166 |

@github-actions bot added the VPR (VPR FPGA Placement & Routing Tool) and lang-cpp (C/C++ code) labels Mar 20, 2025
@AlexandreSinger
@amin1377 Please review!

@AlexandreSinger AlexandreSinger force-pushed the feature-ap-partial-legalizer branch 2 times, most recently from 240822c to 51bddab Compare March 20, 2025 03:17
@AlexandreSinger AlexandreSinger force-pushed the feature-ap-partial-legalizer branch from 51bddab to fa7e727 Compare March 20, 2025 19:21
Updated the partial legalizer to now take into account block types when
spreading blocks.

This creates windows around overfilled bins that are aware of which
block types are overfilled and how large the window needs to be to
accommodate them. It also takes these block types into account when
spreading to only allow blocks to spread into sub-windows that they can
exist in.

This improves quality but was detrimental to performance, so some
performance improvements were needed.

To improve the performance of the partial legalizer, I split the problem
into groups of models which must be spread together. This allows us to
create tighter windows and can make some parts of the legalizer more
efficient. Created a model grouper class which forms the model pack
patterns into a graph and finds disconnected sub-graphs to form the model
groups.

Also improved the window generation by pre-clustering the overfilled
bins before creating the windows. This sped up the window generation
code since fewer windows overlap.
@AlexandreSinger AlexandreSinger force-pushed the feature-ap-partial-legalizer branch from fa7e727 to ce50295 Compare March 20, 2025 19:35
@amin1377 amin1377 left a comment

Thanks, Alex!

@AlexandreSinger AlexandreSinger merged commit c9e6075 into verilog-to-routing:master Mar 20, 2025
36 checks passed
@AlexandreSinger AlexandreSinger deleted the feature-ap-partial-legalizer branch March 20, 2025 20:25