Changing subtile selection in the try_centroid_placement of initial_placement #2897

Merged
78 changes: 66 additions & 12 deletions vpr/src/place/initial_placement.cpp
@@ -147,6 +147,26 @@ static bool is_loc_legal(const t_pl_loc& loc,
const PartitionRegion& pr,
t_logical_block_type_ptr block_type);

/**
* @brief Helper function to choose a subtile in specified location if compatible and available one exits.
Contributor

the type is compatible and an available one exists.

*
* @param centroid The centroid location at which the subtile will be selected using its x,y, and layer.
* @param block_type Logical block type of the macro head member.
Contributor

Logical block type we would like to place here

* @param block_loc_registry Placement block location information. To be filled with the location
* where pl_macro is placed.
Contributor

I think you should remove the "To be filled with the location where pl_macro is placed."
I'd just say:
"Information on where other blocks have been placed."

* @param pr The PartitionRegion of the macro head member - represents its floorplanning constraints, is the size of
Contributor

I'd just say:
The PartitionRegion of the block we are trying to place - represents its floorplanning constraints; it is the size of the whole chip if the block is not constrained.

* the whole chip if the macro is not constrained.
* @param rng A random number generator to select subtile from available and compatible ones.
Contributor

(grammatical nit): a subtile from the available and compatible ones

*
* @return False if location on chip, legal, but no available subtile found. True otherwise. False leads us to
Contributor

if the location is on the chip and legal but no available subtile is found at that location.

I'd delete the "False leads us to neighbour placement currently"

* neighbour placement currently.
*/
static bool find_subtile_in_location(t_pl_loc& centroid,
t_logical_block_type_ptr block_type,
const BlkLocRegistry& blk_loc_registry,
const PartitionRegion& pr,
vtr::RngContainer& rng);

/**
* @brief Calculates a centroid location for a block based on its placed connections.
*
@@ -339,6 +359,42 @@ static bool is_loc_legal(const t_pl_loc& loc,
return legal;
}

bool find_subtile_in_location(t_pl_loc& centroid,
t_logical_block_type_ptr block_type,
const BlkLocRegistry& blk_loc_registry,
const PartitionRegion& pr,
vtr::RngContainer& rng) {
//check if the location is on chip and legal, if yes try to update subtile
if (is_loc_on_chip({centroid.x, centroid.y, centroid.layer}) && is_loc_legal(centroid, pr, block_type)) {
//finding the subtile location
const auto& device_ctx = g_vpr_ctx.device();
const auto& compressed_block_grid = g_vpr_ctx.placement().compressed_block_grids[block_type->index];
const auto& type = device_ctx.grid.get_physical_type({centroid.x, centroid.y, centroid.layer});
const auto& compatible_sub_tiles = compressed_block_grid.compatible_sub_tile_num(type->index);

//filter out occupied subtiles
const GridBlock& grid_blocks = blk_loc_registry.grid_blocks();
std::vector<int> available_sub_tiles;
available_sub_tiles.reserve(compatible_sub_tiles.size());
for (int sub_tile : compatible_sub_tiles) {
t_pl_loc pos = {centroid.x, centroid.y, sub_tile, centroid.layer};
if (!grid_blocks.block_at_location(pos)) {
available_sub_tiles.push_back(sub_tile);
}
}

//If there is at least one available subtile, update the centroid. Otherwise, since the location
//is legal and on chip but no subtile was found, return false so that neighbour placement is tried.
if (!available_sub_tiles.empty()) {
centroid.sub_tile = available_sub_tiles[rng.irand((int)available_sub_tiles.size() - 1)];
} else {
return false;
}
}

return true;
Contributor

A bit strange -- we return true if we find a subtile, or if we don't even try. Could we return true if we found a legal subtile and false otherwise (and change the calling code)? It's a clearer interface.

}
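
The interface the reviewer asks for (return true only when a subtile was actually chosen) could look roughly like the sketch below. It is a self-contained illustration under stated assumptions, not the code merged in this PR: `Loc`, `Grid`, `on_chip`, `is_occupied` and `pick_random` are hypothetical stand-ins for `t_pl_loc`, the device grid and `GridBlock` lookups, and `vtr::RngContainer`, and the partition-region legality check is folded into a single `on_chip` test for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <set>
#include <tuple>
#include <vector>

// Hypothetical stand-in for t_pl_loc.
struct Loc {
    int x;
    int y;
    int sub_tile;
    int layer;
};

// Hypothetical stand-in for the device grid plus the placed-block registry.
struct Grid {
    int width;
    int height;
    std::vector<int> compatible_sub_tiles;              // assumed identical at every tile
    std::set<std::tuple<int, int, int, int>> occupied;  // (x, y, sub_tile, layer)

    bool on_chip(const Loc& loc) const {
        return loc.x >= 0 && loc.x < width && loc.y >= 0 && loc.y < height && loc.layer == 0;
    }
    bool is_occupied(int x, int y, int sub_tile, int layer) const {
        return occupied.count(std::make_tuple(x, y, sub_tile, layer)) > 0;
    }
};

// Uniformly picks one element; the caller must pass a non-empty vector.
static int pick_random(const std::vector<int>& items, std::mt19937& rng) {
    assert(!items.empty());
    std::uniform_int_distribution<std::size_t> pick(0, items.size() - 1);
    return items[pick(rng)];
}

// Returns true only if a free, compatible subtile was found and written to
// loc.sub_tile. Every failure (off chip or all subtiles occupied) returns
// false and leaves loc untouched.
bool find_subtile_in_location(Loc& loc, const Grid& grid, std::mt19937& rng) {
    if (!grid.on_chip(loc)) {
        return false;
    }
    std::vector<int> available;
    available.reserve(grid.compatible_sub_tiles.size());
    for (int sub_tile : grid.compatible_sub_tiles) {
        if (!grid.is_occupied(loc.x, loc.y, sub_tile, loc.layer)) {
            available.push_back(sub_tile);
        }
    }
    if (available.empty()) {
        return false;
    }
    loc.sub_tile = pick_random(available, rng);
    return true;
}
```

With this shape the caller no longer needs to distinguish "did not try" from "tried and failed": a false result always means the location cannot be used as-is.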

static bool find_centroid_neighbor(t_pl_loc& centroid_loc,
t_logical_block_type_ptr block_type,
bool search_for_empty,
@@ -550,10 +606,15 @@ static bool try_centroid_placement(const t_pl_macro& pl_macro,
t_pl_loc centroid_loc(OPEN, OPEN, OPEN, OPEN);
std::vector<ClusterBlockId> unplaced_blocks_to_update_their_score;

bool try_neighbour_due_to_subtile = false;

if (!flat_placement_info.valid) {
// If a flat placement is not provided, use the centroid of connected
// blocks which have already been placed.
unplaced_blocks_to_update_their_score = find_centroid_loc(pl_macro, centroid_loc, blk_loc_registry);
if(!find_subtile_in_location(centroid_loc, block_type, blk_loc_registry, pr, rng)) {
try_neighbour_due_to_subtile = true;
}
} else {
// If a flat placement is provided, use the flat placement to get the
// centroid.
@@ -566,6 +627,9 @@ static bool try_centroid_placement(const t_pl_macro& pl_macro,
if (!is_loc_on_chip({centroid_loc.x, centroid_loc.y, centroid_loc.layer}) ||
!is_loc_legal(centroid_loc, pr, block_type)) {
unplaced_blocks_to_update_their_score = find_centroid_loc(pl_macro, centroid_loc, blk_loc_registry);
if(!find_subtile_in_location(centroid_loc, block_type, blk_loc_registry, pr, rng)) {
try_neighbour_due_to_subtile = true;
}
}
}

@@ -576,9 +640,8 @@ static bool try_centroid_placement(const t_pl_macro& pl_macro,

//centroid suggestion was either occupied or does not match block type
//try to find a near location that meet these requirements
- bool neighbor_legal_loc = false;
- if (!is_loc_legal(centroid_loc, pr, block_type)) {
- neighbor_legal_loc = find_centroid_neighbor(centroid_loc, block_type, false, blk_loc_registry, rng);
+ if (!is_loc_legal(centroid_loc, pr, block_type) || try_neighbour_due_to_subtile) {
Contributor

I think you can simplify this to `if (!found_legal_subtile)` (pick whatever variable name makes sense) if you change the `find_subtile_in_location` routine to return false whenever it fails to find a valid subtile at the given location, for any reason.

It makes that routine clearer, and the logic here more straightforward.

+ bool neighbor_legal_loc = find_centroid_neighbor(centroid_loc, block_type, false, blk_loc_registry, rng);
if (!neighbor_legal_loc) { //no neighbor candidate found
return false;
}
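
The simplified calling logic suggested in the review comment above might be sketched as follows. This is an assumption-laden illustration rather than the PR's code: `Loc`, `resolve_centroid`, and the two callbacks are hypothetical names standing in for `t_pl_loc`, the relevant part of `try_centroid_placement`, `find_subtile_in_location` and `find_centroid_neighbor`.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for t_pl_loc.
struct Loc {
    int x;
    int y;
    int sub_tile;
    int layer;
};

// Each callback returns true on success and, on success, sets loc.sub_tile.
using LocFinder = std::function<bool(Loc&)>;

// Returns true when loc ends at a usable location: either the centroid itself
// (a subtile was found there) or a neighbouring location.
bool resolve_centroid(Loc& loc,
                      const LocFinder& find_subtile,
                      const LocFinder& find_neighbor) {
    bool found_legal_subtile = find_subtile(loc);
    if (!found_legal_subtile) {
        // Any failure at the centroid (off chip, illegal, or fully occupied)
        // takes the same fallback path: search the neighbourhood.
        if (!find_neighbor(loc)) {
            return false;  // no neighbour candidate found
        }
    }
    // Whichever finder succeeded is expected to have chosen a subtile.
    assert(loc.sub_tile >= 0);
    return true;
}
```

A single boolean then replaces both the separate `is_loc_legal` re-check and the `try_neighbour_due_to_subtile` flag at this point in the flow.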
@@ -590,15 +653,6 @@ static bool try_centroid_placement(const t_pl_macro& pl_macro,
}

auto& device_ctx = g_vpr_ctx.device();
- //choose the location's subtile if the centroid location is legal.
- //if the location is found within the "find_centroid_neighbor", it already has a subtile
- //we don't need to find one again
- if (!neighbor_legal_loc) {
- const auto& compressed_block_grid = g_vpr_ctx.placement().compressed_block_grids[block_type->index];
- const auto& type = device_ctx.grid.get_physical_type({centroid_loc.x, centroid_loc.y, centroid_loc.layer});
- const auto& compatible_sub_tiles = compressed_block_grid.compatible_sub_tile_num(type->index);
- centroid_loc.sub_tile = compatible_sub_tiles[rng.irand((int)compatible_sub_tiles.size() - 1)];
- }
int width_offset = device_ctx.grid.get_width_offset({centroid_loc.x, centroid_loc.y, centroid_loc.layer});
int height_offset = device_ctx.grid.get_height_offset({centroid_loc.x, centroid_loc.y, centroid_loc.layer});
VTR_ASSERT(width_offset == 0);
@@ -1,5 +1,5 @@
arch circuit script_params vtr_flow_elapsed_time vtr_max_mem_stage vtr_max_mem error odin_synth_time max_odin_mem parmys_synth_time max_parmys_mem abc_depth abc_synth_time abc_cec_time abc_sec_time max_abc_mem ace_time max_ace_mem num_clb num_io num_memories num_mult vpr_status vpr_revision vpr_build_info vpr_compiler vpr_compiled hostname rundir max_vpr_mem num_primary_inputs num_primary_outputs num_pre_packed_nets num_pre_packed_blocks num_netlist_clocks num_post_packed_nets num_post_packed_blocks device_width device_height device_grid_tiles device_limiting_resources device_name pack_mem pack_time placed_wirelength_est total_swap accepted_swap rejected_swap aborted_swap place_mem place_time place_quench_time placed_CPD_est placed_setup_TNS_est placed_setup_WNS_est placed_geomean_nonvirtual_intradomain_critical_path_delay_est place_delay_matrix_lookup_time place_quench_timing_analysis_time place_quench_sta_time place_total_timing_analysis_time place_total_sta_time ap_mem ap_time ap_full_legalizer_mem ap_full_legalizer_time min_chan_width routed_wirelength min_chan_width_route_success_iteration logic_block_area_total logic_block_area_used min_chan_width_routing_area_total min_chan_width_routing_area_per_tile min_chan_width_route_time min_chan_width_total_timing_analysis_time min_chan_width_total_sta_time crit_path_num_rr_graph_nodes crit_path_num_rr_graph_edges crit_path_collapsed_nodes crit_path_routed_wirelength crit_path_route_success_iteration crit_path_total_nets_routed crit_path_total_connections_routed crit_path_total_heap_pushes crit_path_total_heap_pops critical_path_delay geomean_nonvirtual_intradomain_critical_path_delay setup_TNS setup_WNS hold_TNS hold_WNS crit_path_routing_area_total crit_path_routing_area_per_tile router_lookahead_computation_time crit_path_route_time crit_path_create_rr_graph_time crit_path_create_intra_cluster_rr_graph_time crit_path_tile_lookahead_computation_time crit_path_router_lookahead_computation_time crit_path_total_timing_analysis_time crit_path_total_sta_time
fixed_k6_frac_N8_22nm.xml single_wire.v common 1.48 vpr 75.32 MiB -1 -1 0.06 20584 1 0.02 -1 -1 33044 -1 -1 0 1 0 0 success v8.0.0-12150-gcad6e12c1-dirty release VTR_ASSERT_LEVEL=3 GNU 13.2.0 on Linux-6.8.0-49-generic x86_64 2025-02-10T16:46:28 srivatsan-Precision-Tower-5810 /home/alex/vtr-verilog-to-routing 77128 1 1 0 2 0 1 2 17 17 289 -1 unnamed_device -1 -1 2 3 0 0 3 75.3 MiB 0.54 0.00 0.2714 -0.2714 -0.2714 nan 0.39 7.531e-06 4.665e-06 6.0433e-05 4.1841e-05 75.3 MiB 0.54 75.3 MiB 0.51 8 16 1 6.79088e+06 0 166176. 575.005 0.14 0.00086755 0.000798741 20206 45088 -1 18 1 1 1 141 56 0.7726 nan -0.7726 -0.7726 0 0 202963. 702.294 0.01 0.00 0.04 -1 -1 0.01 0.000788782 0.000736266
fixed_k6_frac_N8_22nm.xml single_ff.v common 1.61 vpr 75.31 MiB -1 -1 0.07 20848 1 0.02 -1 -1 33316 -1 -1 1 2 0 0 success v8.0.0-12150-gcad6e12c1-dirty release VTR_ASSERT_LEVEL=3 GNU 13.2.0 on Linux-6.8.0-49-generic x86_64 2025-02-10T16:46:28 srivatsan-Precision-Tower-5810 /home/alex/vtr-verilog-to-routing 77120 2 1 3 3 1 3 4 17 17 289 -1 unnamed_device -1 -1 22 9 3 2 4 75.3 MiB 0.53 0.00 0.74674 -1.41136 -0.74674 0.74674 0.40 2.2227e-05 1.4289e-05 0.000109603 8.1227e-05 75.3 MiB 0.53 75.3 MiB 0.52 20 31 1 6.79088e+06 13472 414966. 1435.87 0.24 0.000932074 0.000851211 22510 95286 -1 30 1 2 2 142 35 0.74674 0.74674 -1.43836 -0.74674 0 0 503264. 1741.40 0.03 0.00 0.08 -1 -1 0.03 0.000841905 0.000781813
fixed_k6_frac_N8_22nm.xml ch_intrinsics.v common 2.56 vpr 76.11 MiB -1 -1 0.25 21992 3 0.07 -1 -1 36796 -1 -1 32 99 1 0 success v8.0.0-12150-gcad6e12c1-dirty release VTR_ASSERT_LEVEL=3 GNU 13.2.0 on Linux-6.8.0-49-generic x86_64 2025-02-10T16:46:28 srivatsan-Precision-Tower-5810 /home/alex/vtr-verilog-to-routing 77936 99 130 240 229 1 229 262 17 17 289 -1 unnamed_device -1 -1 922 19536 1095 3280 15161 76.1 MiB 0.68 0.00 1.90502 -125.031 -1.90502 1.90502 0.40 0.000671414 0.000596121 0.0173508 0.0153886 76.1 MiB 0.68 76.1 MiB 0.67 32 2029 19 6.79088e+06 979104 586450. 2029.24 0.49 0.0967068 0.0858868 24814 144142 -1 1797 13 568 837 56998 16456 1.9213 1.9213 -139.939 -1.9213 -0.21204 -0.16867 744469. 2576.02 0.04 0.03 0.12 -1 -1 0.04 0.0364713 0.0327019
fixed_k6_frac_N8_22nm.xml diffeq1.v common 17.46 vpr 78.03 MiB -1 -1 0.39 26984 15 0.32 -1 -1 37600 -1 -1 47 162 0 5 success v8.0.0-12150-gcad6e12c1-dirty release VTR_ASSERT_LEVEL=3 GNU 13.2.0 on Linux-6.8.0-49-generic x86_64 2025-02-10T16:46:28 srivatsan-Precision-Tower-5810 /home/alex/vtr-verilog-to-routing 79904 162 96 817 258 1 740 310 17 17 289 -1 unnamed_device -1 -1 7095 27558 418 7907 19233 78.0 MiB 1.56 0.01 21.5089 -1654.97 -21.5089 21.5089 0.41 0.00201386 0.0017781 0.0639353 0.0570499 78.0 MiB 1.56 78.0 MiB 1.06 66 13667 30 6.79088e+06 2.61318e+06 1.11570e+06 3860.55 12.67 1.01308 0.910151 31150 283249 -1 12086 16 3735 9584 1188962 281307 20.677 20.677 -1595.68 -20.677 0 0 1.39736e+06 4835.16 0.07 0.27 0.27 -1 -1 0.07 0.15436 0.14041
arch circuit script_params vtr_flow_elapsed_time vtr_max_mem_stage vtr_max_mem error odin_synth_time max_odin_mem parmys_synth_time max_parmys_mem abc_depth abc_synth_time abc_cec_time abc_sec_time max_abc_mem ace_time max_ace_mem num_clb num_io num_memories num_mult vpr_status vpr_revision vpr_build_info vpr_compiler vpr_compiled hostname rundir max_vpr_mem num_primary_inputs num_primary_outputs num_pre_packed_nets num_pre_packed_blocks num_netlist_clocks num_post_packed_nets num_post_packed_blocks device_width device_height device_grid_tiles device_limiting_resources device_name pack_mem pack_time placed_wirelength_est total_swap accepted_swap rejected_swap aborted_swap place_mem place_time place_quench_time placed_CPD_est placed_setup_TNS_est placed_setup_WNS_est placed_geomean_nonvirtual_intradomain_critical_path_delay_est place_delay_matrix_lookup_time place_quench_timing_analysis_time place_quench_sta_time place_total_timing_analysis_time place_total_sta_time ap_mem ap_time ap_full_legalizer_mem ap_full_legalizer_time min_chan_width routed_wirelength min_chan_width_route_success_iteration logic_block_area_total logic_block_area_used min_chan_width_routing_area_total min_chan_width_routing_area_per_tile min_chan_width_route_time min_chan_width_total_timing_analysis_time min_chan_width_total_sta_time crit_path_num_rr_graph_nodes crit_path_num_rr_graph_edges crit_path_collapsed_nodes crit_path_routed_wirelength crit_path_route_success_iteration crit_path_total_nets_routed crit_path_total_connections_routed crit_path_total_heap_pushes crit_path_total_heap_pops critical_path_delay geomean_nonvirtual_intradomain_critical_path_delay setup_TNS setup_WNS hold_TNS hold_WNS crit_path_routing_area_total crit_path_routing_area_per_tile router_lookahead_computation_time crit_path_route_time crit_path_create_rr_graph_time crit_path_create_intra_cluster_rr_graph_time crit_path_tile_lookahead_computation_time crit_path_router_lookahead_computation_time crit_path_total_timing_analysis_time crit_path_total_sta_time
fixed_k6_frac_N8_22nm.xml single_wire.v common 2.25 vpr 75.57 MiB -1 -1 0.11 20616 1 0.02 -1 -1 33172 -1 -1 0 1 0 0 success v8.0.0-12163-g0dba7016b-dirty Release VTR_ASSERT_LEVEL=2 GNU 11.4.0 on Linux-6.8.0-51-generic x86_64 2025-02-19T17:54:19 haydar-Precision-5820-Tower /home/haydar/vtr-verilog-to-routing 77384 1 1 0 2 0 1 2 17 17 289 -1 unnamed_device -1 -1 2 3 0 0 3 75.6 MiB 0.82 0.00 0.2714 -0.2714 -0.2714 nan 0.60 1.0195e-05 5.861e-06 7.0627e-05 4.5591e-05 75.6 MiB 0.82 75.6 MiB 0.78 8 16 1 6.79088e+06 0 166176. 575.005 0.22 0.0015764 0.00149137 20206 45088 -1 18 1 1 1 141 56 0.7726 nan -0.7726 -0.7726 0 0 202963. 702.294 0.02 0.00 0.06 -1 -1 0.02 0.00154357 0.00147507
fixed_k6_frac_N8_22nm.xml single_ff.v common 2.07 vpr 75.57 MiB -1 -1 0.11 21004 1 0.02 -1 -1 33328 -1 -1 1 2 0 0 success v8.0.0-12163-g0dba7016b-dirty Release VTR_ASSERT_LEVEL=2 GNU 11.4.0 on Linux-6.8.0-51-generic x86_64 2025-02-19T17:54:19 haydar-Precision-5820-Tower /home/haydar/vtr-verilog-to-routing 77388 2 1 3 3 1 3 4 17 17 289 -1 unnamed_device -1 -1 22 9 3 1 5 75.6 MiB 0.70 0.00 0.74674 -1.4524 -0.74674 0.74674 0.55 1.7604e-05 1.0829e-05 0.000109132 7.4428e-05 75.6 MiB 0.70 75.6 MiB 0.69 20 27 1 6.79088e+06 13472 414966. 1435.87 0.36 0.00134255 0.00124027 22510 95286 -1 26 1 2 2 102 24 0.691615 0.691615 -1.31306 -0.691615 0 0 503264. 1741.40 0.04 0.00 0.12 -1 -1 0.04 0.00165403 0.00156635
fixed_k6_frac_N8_22nm.xml ch_intrinsics.v common 2.78 vpr 76.11 MiB -1 -1 0.25 22288 3 0.07 -1 -1 36924 -1 -1 32 99 1 0 success v8.0.0-12163-g0dba7016b-dirty Release VTR_ASSERT_LEVEL=2 GNU 11.4.0 on Linux-6.8.0-51-generic x86_64 2025-02-19T17:54:19 haydar-Precision-5820-Tower /home/haydar/vtr-verilog-to-routing 77936 99 130 240 229 1 229 262 17 17 289 -1 unnamed_device -1 -1 883 19536 1068 3887 14581 76.1 MiB 0.67 0.00 1.86512 -124.45 -1.86512 1.86512 0.39 0.000560506 0.000504742 0.0161729 0.0146716 76.1 MiB 0.67 76.1 MiB 0.66 32 1890 11 6.79088e+06 979104 586450. 2029.24 0.47 0.0893208 0.0810973 24814 144142 -1 1712 13 543 802 57386 17520 1.9213 1.9213 -143.517 -1.9213 -0.04337 -0.04337 744469. 2576.02 0.06 0.05 0.19 -1 -1 0.06 0.053443 0.048647
fixed_k6_frac_N8_22nm.xml diffeq1.v common 12.29 vpr 77.89 MiB -1 -1 0.61 27152 15 0.49 -1 -1 38004 -1 -1 47 162 0 5 success v8.0.0-12163-g0dba7016b-dirty Release VTR_ASSERT_LEVEL=2 GNU 11.4.0 on Linux-6.8.0-51-generic x86_64 2025-02-19T17:54:19 haydar-Precision-5820-Tower /home/haydar/vtr-verilog-to-routing 79760 162 96 817 258 1 740 310 17 17 289 -1 unnamed_device -1 -1 7006 24414 236 6771 17407 77.9 MiB 1.84 0.01 21.8698 -1649.28 -21.8698 21.8698 0.44 0.00183251 0.0016749 0.0693904 0.0634239 77.9 MiB 1.84 77.9 MiB 1.10 60 14847 46 6.79088e+06 2.61318e+06 1.01997e+06 3529.29 6.71 1.05045 0.971299 29998 257685 -1 12402 16 3793 9643 1173029 292327 21.3427 21.3427 -1635.12 -21.3427 0 0 1.27783e+06 4421.56 0.06 0.28 0.21 -1 -1 0.06 0.146101 0.135935