This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.

Commit ebc56a3

committed Apr 4, 2025
[Router] Updated Command-Line Usage for Parallel Connection Router
Updated the command-line usage for parallel connection router in both Read the Docs and read_options.cpp.
1 parent 7e499c7 commit ebc56a3

2 files changed: +103 -32 lines changed

 

‎doc/src/vpr/command_line_usage.rst

Lines changed: 101 additions & 30 deletions
@@ -47,12 +47,12 @@ By default VPR will perform a binary search routing to find the minimum channel
 
 Detailed Command-line Options
 -----------------------------
-VPR has a lot of options. Running :option:`vpr --help` will display all the available options and their usage information.
+VPR has a lot of options. Running :option:`vpr --help` will display all the available options and their usage information.
 
 .. option:: -h, --help
 
 Display help message then exit.
-
+
 The options most people will be interested in are:
 
 * :option:`--route_chan_width` (route at a fixed channel width), and
@@ -208,7 +208,7 @@ General Options
 * Any string matching ``name`` attribute of a device layout defined with a ``<fixed_layout>`` tag in the :ref:`arch_grid_layout` section of the architecture file.
 
 If the value specified is neither ``auto`` nor matches the ``name`` attribute value of a ``<fixed_layout>`` tag, VPR issues an error.
-
+
 .. note:: If the only layout in the architecture file is a single device specified using ``<fixed_layout>``, it is recommended to always specify the ``--device`` option; this prevents the value ``--device auto`` from interfering with operations supported only for ``<fixed_layout>`` grids.
 
 **Default:** ``auto``
@@ -892,7 +892,7 @@ If any of init_t, exit_t or alpha_t is specified, the user schedule, with a fixe
 
 .. option:: --place_agent_algorithm {e_greedy | softmax}
 
-Controls which placement RL agent is used.
+Controls which placement RL agent is used.
 
 **Default:** ``softmax``
 
@@ -914,10 +914,10 @@ If any of init_t, exit_t or alpha_t is specified, the user schedule, with a fixe
 
 .. option:: --place_reward_fun {basic | nonPenalizing_basic | runtime_aware | WLbiased_runtime_aware}
 
-The reward function used by the placement RL agent to learn the best action at each anneal stage.
+The reward function used by the placement RL agent to learn the best action at each anneal stage.
+
+.. note:: The latter two are only available for timing-driven placement.
 
-.. note:: The latter two are only available for timing-driven placement.
-
 **Default:** ``WLbiased_runtime_aware``
 
 .. option:: --place_agent_space {move_type | move_block_type}
@@ -927,20 +927,20 @@ If any of init_t, exit_t or alpha_t is specified, the user schedule, with a fixe
 **Default:** ``move_block_type``
 
 .. option:: --place_quench_only {on | off}
-
+
 If this option is set to ``on``, the placement will skip the annealing phase and only perform the placement quench.
-This option is useful when the quality of initial placement is good enough and there is no need to perform the
+This option is useful when the quality of initial placement is good enough and there is no need to perform the
 annealing phase.
 
 **Default:** ``off``
 
 
 .. option:: --placer_debug_block <int>
-
+
 .. note:: This option is likely only of interest to developers debugging the placement algorithm
 
-Controls which block the placer produces detailed debug information for.
-
+Controls which block the placer produces detailed debug information for.
+
 If the block being moved has the same ID as the number assigned to this parameter, the placer will print debugging information about it.
 
 * For values >= 0, the value is the block ID for which detailed placer debug information should be produced.
@@ -952,7 +952,7 @@ If any of init_t, exit_t or alpha_t is specified, the user schedule, with a fixe
 **Default:** ``-2``
 
 .. option:: --placer_debug_net <int>
-
+
 .. note:: This option is likely only of interest to developers debugging the placement algorithm
 
 Controls which net the placer produces detailed debug information for.
@@ -996,7 +996,7 @@ The following options are only valid when the placement engine is in timing-driv
 
 .. option:: --quench_recompute_divider <int>
 
-Controls how many times the placer performs a timing analysis to update its criticality estimates during a quench.
+Controls how many times the placer performs a timing analysis to update its criticality estimates during a quench.
 If unspecified, uses the value from --inner_loop_recompute_divider.
 
 **Default:** ``0``
@@ -1080,7 +1080,7 @@ The following options are only valid when the placement engine is in timing-driv
 
 NoC Options
 ^^^^^^^^^^^^^^
-The following options are only used when FPGA device and netlist contain a NoC router.
+The following options are only used when FPGA device and netlist contain a NoC router.
 
 .. option:: --noc {on | off}
 
@@ -1090,15 +1090,15 @@ The following options are only used when FPGA device and netlist contain a NoC r
 **Default:** ``off``
 
 .. option:: --noc_flows_file <file>
-
+
 XML file containing the list of traffic flows within the NoC (communication between routers).
 
 .. note:: noc_flows_file must be specified if NoC optimization is turned on (--noc on).
 
 .. option:: --noc_routing_algorithm {xy_routing | bfs_routing | west_first_routing | north_last_routing | negative_first_routing | odd_even_routing}
 
 Controls the algorithm used by the NoC to route packets.
-
+
 * ``xy_routing`` Uses the direction oriented routing algorithm. This is recommended to be used with mesh NoC topologies.
 * ``bfs_routing`` Uses the breadth first search algorithm. The objective is to find a route that uses a minimum number of links. This algorithm is not guaranteed to generate deadlock-free traffic flow routes, but can be used with any NoC topology.
 * ``west_first_routing`` Uses the west-first routing algorithm. This is recommended to be used with mesh NoC topologies.
@@ -1111,11 +1111,11 @@ The following options are only used when FPGA device and netlist contain a NoC r
 .. option:: --noc_placement_weighting <float>
 
 Controls the importance of the NoC placement parameters relative to timing and wirelength of the design.
-
+
 * ``noc_placement_weighting = 0`` means the placement is based solely on timing and wirelength.
 * ``noc_placement_weighting = 1`` means noc placement is considered equal to timing and wirelength.
 * ``noc_placement_weighting > 1`` means the placement is increasingly dominated by NoC parameters.
-
+
 **Default:** ``5.0``
 
 .. option:: --noc_aggregate_bandwidth_weighting <float>
@@ -1133,7 +1133,7 @@ The following options are only used when FPGA device and netlist contain a NoC r
 Other positive numbers specify the importance of meeting latency constraints compared to other NoC-related cost terms.
 Weighting factors for NoC-related cost terms are normalized internally. Therefore, their absolute values are not important, and
 only their relative ratios determine the importance of each cost term.
-
+
 **Default:** ``0.6``
 
 .. option:: --noc_latency_weighting <float>
@@ -1143,7 +1143,7 @@ The following options are only used when FPGA device and netlist contain a NoC r
 Other positive numbers specify the importance of minimizing aggregate latency compared to other NoC-related cost terms.
 Weighting factors for NoC-related cost terms are normalized internally. Therefore, their absolute values are not important, and
 only their relative ratios determine the importance of each cost term.
-
+
 **Default:** ``0.02``
 
 .. option:: --noc_congestion_weighting <float>
@@ -1159,11 +1159,11 @@ The following options are only used when FPGA device and netlist contain a NoC r
 .. option:: --noc_swap_percentage <float>
 
 Sets the minimum fraction of swaps attempted by the placer that are NoC blocks.
-This value is an integer ranging from [0-100].
-
-* ``0`` means NoC blocks will be moved at the same rate as other blocks.
+This value is an integer ranging from [0-100].
+
+* ``0`` means NoC blocks will be moved at the same rate as other blocks.
 * ``100`` means all swaps attempted by the placer are NoC router blocks.
-
+
 **Default:** ``0``
 
 .. option:: --noc_placement_file_name <file>
@@ -1249,7 +1249,7 @@ Analytical Placement is generally split into three stages:
 
 * ``none`` Do not use any Detailed Placer.
 
-* ``annealer`` Use the Annealer from the Placement stage as a Detailed Placer. This will use the same Placer Options from the Place stage to configure the annealer.
+* ``annealer`` Use the Annealer from the Placement stage as a Detailed Placer. This will use the same Placer Options from the Place stage to configure the annealer.
 
 **Default:** ``annealer``
 
@@ -1326,8 +1326,8 @@ VPR uses a negotiated congestion algorithm (based on Pathfinder) to perform rout
 
 .. option:: --max_pres_fac <float>
 
-Sets the maximum present overuse penalty factor that can ever result during routing. Should always be less than 1e25 or so to prevent overflow.
-Smaller values may help prevent circuitous routing in difficult routing problems, but may increase
+Sets the maximum present overuse penalty factor that can ever result during routing. Should always be less than 1e25 or so to prevent overflow.
+Smaller values may help prevent circuitous routing in difficult routing problems, but may increase
 the number of routing iterations needed and hence runtime.
 
 **Default:** ``1000.0``
@@ -1406,7 +1406,7 @@ VPR uses a negotiated congestion algorithm (based on Pathfinder) to perform rout
 
 .. option:: --router_algorithm {timing_driven | parallel | parallel_decomp}
 
-Selects which router algorithm to use.
+Selects which router algorithm to use.
 
 * ``timing_driven`` is the default single-threaded PathFinder algorithm.
 
@@ -1488,13 +1488,84 @@ The following options are only valid when the router is in timing-driven mode (t
 **Default:** ``0.0``
 
 .. option:: --router_profiler_astar_fac <float>
-
+
 Controls the directedness of the timing-driven router's exploration when doing router delay profiling of an architecture.
 The router delay profiling step is currently used to calculate the place delay matrix lookup.
 Values between 1 and 2 are reasonable; higher values trade some quality for reduced run-time.
 
 **Default:** ``1.2``
 
+.. option:: --enable_parallel_connection_router {on | off}
+
+Controls whether the MultiQueue-based parallel connection router is used during a single connection routing.
+
+When enabled, the parallel connection router accelerates the path search for individual source-sink connections using
+multi-threading without altering the net routing order.
+
+**Default:** ``off``
+
+.. option:: --post_target_prune_fac <float>
+
+Controls the post-target pruning heuristic calculation in the parallel connection router.
+
+This parameter is used as a multiplicative factor applied to the VPR heuristic (not guaranteed to be admissible, i.e.,
+might over-predict the cost to the sink) to calculate the 'stopping heuristic' when pruning nodes after the target has
+been reached. The 'stopping heuristic' must be admissible for the path search algorithm to guarantee optimal paths and
+be deterministic.
+
+Values of this parameter are architecture-specific and have to be empirically found.
+
+This parameter has no effect if :option:`--enable_parallel_connection_router` is not set.
+
+**Default:** ``1.2``
+
+.. option:: --post_target_prune_offset <float>
+
+Controls the post-target pruning heuristic calculation in the parallel connection router.
+
+This parameter is used as a subtractive offset together with :option:`--post_target_prune_fac` to apply an affine
+transformation on the VPR heuristic to calculate the 'stopping heuristic'. The 'stopping heuristic' must be admissible
+for the path search algorithm to guarantee optimal paths and be deterministic.
+
+Values of this parameter are architecture-specific and have to be empirically found.
+
+This parameter has no effect if :option:`--enable_parallel_connection_router` is not set.
+
+**Default:** ``0.0``
+
+.. option:: --multi_queue_num_threads <int>
+
+Controls the number of threads used by MultiQueue-based parallel connection router.
+
+If not explicitly specified, defaults to 1, implying the parallel connection router works in 'serial' mode using only
+one main thread to route.
+
+This parameter has no effect if :option:`--enable_parallel_connection_router` is not set.
+
+**Default:** ``1``
+
+.. option:: --multi_queue_num_queues <int>
+
+Controls the number of queues used by MultiQueue in the parallel connection router.
+
+Must be set >= 2. A common configuration for this parameter is the number of threads used by MultiQueue * 4 (the number
+of queues per thread).
+
+This parameter has no effect if :option:`--enable_parallel_connection_router` is not set.
+
+**Default:** ``2``
+
+.. option:: --multi_queue_direct_draining {on | off}
+
+Controls whether to enable queue draining optimization for MultiQueue-based parallel connection router.
+
+When enabled, queues can be emptied quickly by draining all elements if no further solutions need to be explored in the
+path search to guarantee optimality or determinism after reaching the target.
+
+This parameter has no effect if :option:`--enable_parallel_connection_router` is not set.
+
+**Default:** ``off``
+
 .. option:: --max_criticality <float>
 
 Sets the maximum fraction of routing cost that can come from delay (vs. coming from routability) for any net.
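
The documentation added in this hunk describes ``--post_target_prune_fac`` and ``--post_target_prune_offset`` as an affine transformation of the VPR heuristic. As a rough illustration only, the sketch below assumes the form ``fac * h - offset``, clamped at zero. The order of operations, the clamping, and the pruning test itself are assumptions, and the names ``stopping_heuristic`` and ``should_prune`` are hypothetical rather than taken from the VPR sources.

```python
def stopping_heuristic(vpr_heuristic, prune_fac=1.2, prune_offset=0.0):
    """Affine transform of the VPR heuristic (assumed form: fac * h - offset)."""
    # Clamping at zero is an assumption so the estimate never becomes negative.
    return max(0.0, prune_fac * vpr_heuristic - prune_offset)


def should_prune(backward_cost, vpr_heuristic, best_cost_to_target,
                 prune_fac=1.2, prune_offset=0.0):
    """Hypothetical post-target test: drop a node whose estimated total cost
    cannot improve on the best path already found to the target."""
    estimated_total = backward_cost + stopping_heuristic(
        vpr_heuristic, prune_fac, prune_offset)
    return estimated_total >= best_cost_to_target
```

Under this reading, a larger offset lowers the estimate and so prunes fewer nodes, which matches the doc's point that suitable values are architecture-specific and have to be found empirically.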

‎vpr/src/base/read_options.cpp

Lines changed: 2 additions & 2 deletions
@@ -2704,8 +2704,8 @@ argparse::ArgumentParser create_arg_parser(const std::string& prog_name, t_optio
 
 route_timing_grp.add_argument<bool, ParseOnOff>(args.enable_parallel_connection_router, "--enable_parallel_connection_router")
 .help(
-"Controls whether the parallel connection router is used during a single connection routing."
-" When enabled, the parallel connection router accelerates the path search for individual"
+"Controls whether the MultiQueue-based parallel connection router is used during a single connection"
+" routing. When enabled, the parallel connection router accelerates the path search for individual"
 " source-sink connections using multi-threading without altering the net routing order.")
 .default_value("off")
 .show_in(argparse::ShowIn::HELP_ONLY);
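
For orientation, the options documented in this commit might be combined on a VPR command line roughly as follows. This is a minimal sketch: the helper ``parallel_router_args`` is hypothetical, ``arch.xml`` and ``circuit.blif`` are placeholder file names, and the queue count simply follows the doc's guidance of four queues per thread with a minimum of 2.

```python
def parallel_router_args(num_threads, queues_per_thread=4, direct_draining=False):
    """Build the VPR flags for the MultiQueue-based parallel connection router."""
    if num_threads < 1:
        raise ValueError("num_threads must be at least 1")
    # The doc requires at least 2 queues and suggests threads * 4.
    num_queues = max(2, num_threads * queues_per_thread)
    return [
        "--enable_parallel_connection_router", "on",
        "--multi_queue_num_threads", str(num_threads),
        "--multi_queue_num_queues", str(num_queues),
        "--multi_queue_direct_draining", "on" if direct_draining else "off",
    ]


# Placeholder architecture and circuit files, for illustration only.
command = ["vpr", "arch.xml", "circuit.blif"] + parallel_router_args(4)
print(" ".join(command))
```

The ``--multi_queue_*`` flags are only meaningful together with ``--enable_parallel_connection_router on``, since the doc states they have no effect otherwise.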
