[AP][InitialPlacement] Created Isolated AP Flow #2988

AlexandreSinger · 2025-04-17T22:19:05Z

The old initial placer used in the AP flow was built inside the initial placer of the non-AP flow. This forced the AP flow to place blocks one at a time with minimum displacement. This is not ideal, since blocks that were placed earlier got first pick of locations and could displace a later cluster that would have been a better fit for that location.

This PR separates out the AP initial placement code. For AP, initial placement is now done in passes.

The first pass tries to place each cluster exactly at the tile containing the centroid of all atoms within the cluster (according to the global placement). Any clusters that could not be placed are deferred to the next pass.

The second pass will allow clusters to be placed within 1 tile of their centroid.

All subsequent passes allow clusters to be placed exponentially farther from their centroid.

Initial placement terminates when all clusters have been placed or when the max displacement reaches the size of the entire device.
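The pass structure described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `Tile`, `Grid`, `try_place`, and `place_in_passes` names are hypothetical, every tile is assumed to hold at most one cluster, distance is assumed to be Manhattan distance, and "exponentially farther" is assumed to mean doubling the allowed displacement each pass.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical, simplified device model: each tile holds at most one cluster.
struct Tile {
    int x;
    int y;
};

struct Grid {
    int width;
    int height;
    std::vector<bool> occupied; // row-major, width * height entries

    Grid(int w, int h)
        : width(w), height(h), occupied(static_cast<std::size_t>(w) * static_cast<std::size_t>(h), false) {}

    bool is_free(int x, int y) const { return !occupied[static_cast<std::size_t>(y * width + x)]; }
    void take(int x, int y) { occupied[static_cast<std::size_t>(y * width + x)] = true; }
};

// Try to place one cluster on the closest free tile that is at most
// max_disp tiles (Manhattan distance) away from its centroid tile.
bool try_place(Grid& grid, Tile centroid, int max_disp, Tile& placed_at) {
    int best_dist = -1;
    const int y_lo = std::max(0, centroid.y - max_disp);
    const int y_hi = std::min(grid.height - 1, centroid.y + max_disp);
    const int x_lo = std::max(0, centroid.x - max_disp);
    const int x_hi = std::min(grid.width - 1, centroid.x + max_disp);
    for (int y = y_lo; y <= y_hi; y++) {
        for (int x = x_lo; x <= x_hi; x++) {
            const int dist = std::abs(x - centroid.x) + std::abs(y - centroid.y);
            if (dist > max_disp || !grid.is_free(x, y)) continue;
            if (best_dist < 0 || dist < best_dist) {
                best_dist = dist;
                placed_at = Tile{x, y};
            }
        }
    }
    if (best_dist < 0) return false;
    grid.take(placed_at.x, placed_at.y);
    return true;
}

// Place clusters in passes: displacement 0 first, then 1, then doubling
// (the doubling factor is an assumption). Returns how many clusters are
// still unplaced once the displacement covers the whole device.
std::size_t place_in_passes(Grid& grid, const std::vector<Tile>& centroids, std::vector<Tile>& locs) {
    locs.assign(centroids.size(), Tile{-1, -1});
    std::vector<std::size_t> unplaced;
    for (std::size_t i = 0; i < centroids.size(); i++) {
        assert(centroids[i].x >= 0 && centroids[i].x < grid.width);
        assert(centroids[i].y >= 0 && centroids[i].y < grid.height);
        unplaced.push_back(i);
    }
    const int device_size = grid.width + grid.height;
    int max_disp = 0;
    while (!unplaced.empty()) {
        std::vector<std::size_t> deferred; // clusters reserved for the next pass
        for (std::size_t id : unplaced) {
            if (!try_place(grid, centroids[id], max_disp, locs[id])) deferred.push_back(id);
        }
        unplaced.swap(deferred);
        if (max_disp >= device_size) break; // already searched the whole device
        max_disp = (max_disp == 0) ? 1 : max_disp * 2;
    }
    return unplaced.size();
}
```

In this sketch, three clusters that all want tile (1, 1) on a 3x3 grid end up with one at the centroid after the first pass and the other two on neighbouring tiles after the second pass, rather than the first-come cluster pushing later ones arbitrarily far away.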

The clusters are sorted based on the size of the macro that contains them and the variance of the placement of the atoms within the macro. This allows large macro blocks with low variance to be placed first.
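One way to picture this ordering is the comparator below. It is only a sketch: the `MacroInfo` struct and `sort_macros_for_placement` are hypothetical names, and breaking ties on variance only after comparing macro size is an assumption made for illustration; the PR itself may combine the two terms differently.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical summary of a placement macro.
struct MacroInfo {
    int id;
    int num_clusters; // number of clusters in the macro
    float variance;   // variance of the atoms' global placement positions
};

// Order macros so that larger macros come first and, among macros of equal
// size, the ones whose atoms agree most on a location (low variance) come first.
void sort_macros_for_placement(std::vector<MacroInfo>& macros) {
    for (const MacroInfo& m : macros) {
        assert(m.num_clusters > 0 && m.variance >= 0.0f);
    }
    std::stable_sort(macros.begin(), macros.end(),
                     [](const MacroInfo& a, const MacroInfo& b) {
                         if (a.num_clusters != b.num_clusters) {
                             return a.num_clusters > b.num_clusters;
                         }
                         return a.variance < b.variance;
                     });
}
```

Comparing raw variances gives the same order as comparing any monotonically increasing normalization of them, so the normalization only matters once the two terms are blended into a single cost.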

Results on the largest VTR circuits (fixed IOs):

| Metric | Change |
| --- | --- |
| Normalized Post FL WL | 0.947 |
| Normalized Post-Route WL | 1.003 |
| Normalized Atom Errors | 0.922 |
| Normalized Atom Displacement | 0.950 |
| Normalized Max Atom Displacement | 1.047 |

This improved the initial placement quality by around 5%, and the number of atom errors (where atoms are placed in a tile they do not want to be placed in according to the global placement) went down by around 8%. Atom displacement improved by 5%. The max atom displacement got worse by around 5%. I think this increase in max displacement is OK. I will collect Titan results to verify this.
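For readers unfamiliar with these metrics, the sketch below shows one plausible way to compute them. It is not the measurement code used for the table: `AtomPlacement`, `Metrics`, and `measure` are hypothetical names, the desired tile is assumed to be the floor of the global placement position, and displacement is assumed to be Manhattan distance in tiles.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical record for one atom.
struct AtomPlacement {
    float gp_x;  // position requested by global placement
    float gp_y;
    int tile_x;  // tile the atom ended up in after initial placement
    int tile_y;
};

struct Metrics {
    std::size_t atom_errors; // atoms not in the tile they asked for
    float total_disp;        // sum of per-atom displacement, in tiles
    float max_disp;          // largest per-atom displacement, in tiles
};

Metrics measure(const std::vector<AtomPlacement>& atoms) {
    Metrics m{0, 0.0f, 0.0f};
    for (const AtomPlacement& a : atoms) {
        assert(a.gp_x >= 0.0f && a.gp_y >= 0.0f);
        // Assumption: the desired tile is the one containing the GP position.
        const int want_x = static_cast<int>(std::floor(a.gp_x));
        const int want_y = static_cast<int>(std::floor(a.gp_y));
        if (want_x != a.tile_x || want_y != a.tile_y) m.atom_errors++;
        const float disp = static_cast<float>(std::abs(want_x - a.tile_x) + std::abs(want_y - a.tile_y));
        m.total_disp += disp;
        m.max_disp = std::max(m.max_disp, disp);
    }
    return m;
}
```

Dividing each value by the same value measured on the baseline flow gives normalized numbers like those in the table, where a value below 1 means the metric went down.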

AlexandreSinger · 2025-04-17T22:34:32Z

The output status messages look like this:

AlexandreSinger · 2025-04-17T22:36:32Z

@amin1377 FYI

AlexandreSinger · 2025-04-19T20:06:15Z

Titan results are in:

| Metric | Improvement over baseline |
| --- | --- |
| Post FL HPWL | 1.0033 |
| Post Route HPWL | 0.9968 |
| Atom Errors | 0.9470 |
| Average Atom Displacement | 1.0527 |
| Max Atom Displacement | 1.0657 |

This change had very little effect on the Titan results; however, more atoms appear to be placed where they wanted to go according to the global placement. Looking through the raw data, I notice a lot of outliers that are pulling up many of these numbers. I think this may be related to the mass legalizer not knowing exactly how much can be put into different clusters.

vaughnbetz

Nice clean code! One comment on fixed IOs / placement constraints embedded.

vaughnbetz · 2025-04-19T21:04:41Z

vpr/src/place/initial_placement.cpp

+        float variance = get_flat_variance(pl_macro, flat_placement_info);
+        float std_dev = std::sqrt(variance);
+        // Normalize the standard deviation to be a number between 0 and 1.
+        float normalized_std_dev = std_dev / (std_dev + 1.0f);


This function will show relatively little difference for most std_dev values; it makes a reasonable difference for small standard deviations (e.g. 1 maps to .5 and 2 to .67), but not much for larger ones (50 maps to .98 and 100 to .99). Is that what you want?

This was somewhat intentional. I tried a few different normalization functions and found this one to work well. My intuition behind it is that we want variances close to 0 to be placed first; anything beyond around 5 has such a high variance that it implies the atoms within do not know where they want to go.

The goal of this cost term is to ensure that clusters with variance 0 get placed first. Eventually I would like to add some mass information so we try to place more "massive" clusters first (clusters with the most pins / number of RAM bits).
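To make the shape of this normalization concrete, here is a small standalone version of the expression from the diff above. The function names are hypothetical; only the formula `std_dev / (std_dev + 1.0f)` comes from the PR.

```cpp
#include <cassert>
#include <cmath>

// Map a standard deviation in [0, inf) to [0, 1). The curve is steep near 0
// and flattens quickly, so only small standard deviations are told apart.
float normalize_std_dev(float std_dev) {
    assert(std_dev >= 0.0f);
    return std_dev / (std_dev + 1.0f);
}

// Same mapping, starting from a variance as in the diff.
float normalized_std_dev_from_variance(float variance) {
    assert(variance >= 0.0f);
    return normalize_std_dev(std::sqrt(variance));
}

// normalize_std_dev(0.0f)   is 0.0
// normalize_std_dev(1.0f)   is 0.5
// normalize_std_dev(2.0f)   is about 0.667
// normalize_std_dev(50.0f)  is about 0.980
// normalize_std_dev(100.0f) is about 0.990
```

A standard deviation of 0 maps to exactly 0, which is what lets clusters with no disagreement among their atoms sort ahead of everything else.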

vaughnbetz · 2025-04-19T21:36:35Z

vpr/src/place/initial_placement.cpp

+
+            // Finally, fix the IO blocks if the user specified the option to do
+            // so.
+            fix_IO_block_types(pl_macro, centroid_loc, pad_loc_type, blk_loc_registry.mutable_block_locs());


I'd beef up this comment a bit more ... if the user asked for a random pad location and this is an IO block, lock down the macro at this location so the placer can't move it.

I also think this code isn't quite right -- locking the IOs is done to test how well you work with randomly locked IOs, or IOs locked to specific locations by board level constraints and expressed with a pad_file that locks them down. This code seems to put the IOs where it wants, and then lock them down (unless the flat placement already put them in the right spot). If the latter, that should be commented. If the former, there should be a TODO to fix this eventually; it is a form of placement constraint, so supporting placement constraints should eventually fix it.

Hi Vaughn, I have updated the comment to beef it up.

The current implementation of the non-AP initial placer does this: whenever it finds a legal site to place a macro, it tries to lock the macro down using this method (the method checks whether the option was passed in). This matches what the fix-pins option expects:

The IO pads are fixed to "arbitrary" locations (when the IOs are not fixed, AP GP will place them basically anywhere it wants). However, I see what you mean since this is not truly "random" since AP is choosing good sites for these blocks. I added a TODO to investigate this.
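The behaviour being discussed, placing a block and then locking it if it is an IO and the user asked for fixed pads, can be illustrated with the toy sketch below. `PadLocType`, `BlockLoc`, and `fix_io_if_requested` are hypothetical stand-ins and do not reproduce the signature or internals of `fix_IO_block_types`.

```cpp
#include <cassert>

// Hypothetical stand-ins for the pad location option and a block's location.
enum class PadLocType { Free, Random };

struct BlockLoc {
    int x;
    int y;
    bool is_fixed; // a fixed block may not be moved by the placer
};

// After a block has been placed: if it is an IO block and the user asked for
// fixed pad locations, lock it where it is. Other blocks stay movable.
void fix_io_if_requested(BlockLoc& loc, bool is_io_block, PadLocType pad_loc_type) {
    assert(loc.x >= 0 && loc.y >= 0); // the block must already be placed
    if (is_io_block && pad_loc_type == PadLocType::Random) {
        loc.is_fixed = true;
    }
}
```

Note that in this sketch the locked location is whatever the placer chose, which is exactly the concern raised in the review: the pads are fixed, but not at locations chosen independently of the placer.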

amin1377 · 2025-04-21T12:28:50Z

Hi @AlexandreSinger,

I have a quick question. In this part of the code, and elsewhere in AP, when block ordering is needed, you always sort based on the size of the macro. That makes perfect sense for the ASIC flow. However, for FPGA placement, I wonder if it might be better to sort based on the number of pins.

I realize there's likely a high correlation between macro size and pin count, but making the sorting criteria explicitly based on pins might result in a better QoR.

AlexandreSinger · 2025-04-21T13:58:17Z

Hi @amin1377, I am not sure what you mean by "elsewhere in AP". As far as I am aware, this is the only place in the AP flow where I sort by the size of the macro. Where else do I do this?

Regarding my use of it here: it has to do with finding a legal placement. Suppose we had a macro with 2 clusters in it. This macro will be harder to place than a single cluster, since it needs two open, legal sites right next to each other (which get harder and harder to find as the macro size increases). If we place these large macros too late, they may never find a legal site to be placed into. That was the intuition behind sorting based on macro size; it was not necessarily for finding better quality solutions.
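The legality argument above can be illustrated with a small sketch that counts how many positions a macro can legally occupy. Everything here is hypothetical (`Offset`, `macro_fits`, `count_legal_heads`), and the grid is simplified to a single block type where a tile is either free or not.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Position of a macro member relative to the macro's head cluster.
struct Offset {
    int dx;
    int dy;
};

// free_tiles[y][x] is true if the tile is open and legal for this block type.
bool macro_fits(const std::vector<std::vector<bool>>& free_tiles,
                int head_x, int head_y, const std::vector<Offset>& members) {
    assert(!free_tiles.empty());
    const int height = static_cast<int>(free_tiles.size());
    const int width = static_cast<int>(free_tiles[0].size());
    for (const Offset& m : members) {
        const int x = head_x + m.dx;
        const int y = head_y + m.dy;
        if (x < 0 || x >= width || y < 0 || y >= height) return false;
        if (!free_tiles[static_cast<std::size_t>(y)][static_cast<std::size_t>(x)]) return false;
    }
    return true; // every member landed on an open, legal tile
}

// Count the head positions at which the whole macro fits.
int count_legal_heads(const std::vector<std::vector<bool>>& free_tiles,
                      const std::vector<Offset>& members) {
    int count = 0;
    for (int y = 0; y < static_cast<int>(free_tiles.size()); y++) {
        for (int x = 0; x < static_cast<int>(free_tiles[0].size()); x++) {
            if (macro_fits(free_tiles, x, y, members)) count++;
        }
    }
    return count;
}
```

On a 3x3 grid with only the centre tile occupied, a single cluster has 8 legal positions while a vertical two-cluster macro has only 4, which is why placing large macros late makes it more likely they find no legal site at all.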

I am currently working on the mass abstraction; I think sorting based on the mass of all blocks within each macro would achieve what you are describing. We could add this to the cost function, but that would have to come later.

github-actions bot added the VPR (VPR FPGA Placement & Routing Tool) and lang-cpp (C/C++ code) labels Apr 17, 2025

AlexandreSinger force-pushed the feature-ap-initial-placer branch from f57750e to ce2b5a0 April 17, 2025 22:35

AlexandreSinger requested a review from vaughnbetz April 17, 2025 22:36

vaughnbetz approved these changes Apr 19, 2025

AlexandreSinger force-pushed the feature-ap-initial-placer branch from ce2b5a0 to dfa1bd3 April 21, 2025 17:44

AlexandreSinger merged commit 3663572 into verilog-to-routing:master Apr 21, 2025
35 checks passed

AlexandreSinger deleted the feature-ap-initial-placer branch April 21, 2025 19:00
