Commit 2bdc0e9

Merge branch 'master' into patch-1
2 parents e6b4a62 + 920e8ab commit 2bdc0e9

265 files changed: +15183 -5952 lines changed

.github/workflows/nightly_test.yml

Lines changed: 17 additions & 14 deletions
Original file line number | Diff line number | Diff line change
@@ -4,20 +4,23 @@ on:
44
# We want to run the CI when anything is pushed to master.
55
# Since master is a protected branch this only happens when a PR is merged.
66
# This is a double check in case the PR was stale and had some issues.
7-
push:
8-
branches:
9-
- master
10-
paths-ignore: # Prevents from running if only docs are updated
11-
- 'doc/**'
12-
- '**/*README*'
13-
- '**.md'
14-
- '**.rst'
15-
pull_request:
16-
paths-ignore: # Prevents from running if only docs are updated
17-
- 'doc/**'
18-
- '**/*README*'
19-
- '**.md'
20-
- '**.rst'
7+
# NOTE: This was turned off in late October 2024 since the Nightly Tests were
8+
# no longer working on the self-hosted runners. Will turn this back on
9+
# once the issue is resolved.
10+
# push:
11+
# branches:
12+
# - master
13+
# paths-ignore: # Prevents from running if only docs are updated
14+
# - 'doc/**'
15+
# - '**/*README*'
16+
# - '**.md'
17+
# - '**.rst'
18+
# pull_request:
19+
# paths-ignore: # Prevents from running if only docs are updated
20+
# - 'doc/**'
21+
# - '**/*README*'
22+
# - '**.md'
23+
# - '**.rst'
2124
workflow_dispatch:
2225
schedule:
2326
- cron: '0 0 * * *' # daily

doc/src/arch/reference.rst

Lines changed: 2 additions & 2 deletions
@@ -2337,8 +2337,8 @@ The ``<direct>`` tag and its contents are described below.
23372337
:req_param y_offset: The y location of the receiving CLB relative to the driving CLB.
23382338
:req_param z_offset: The z location of the receiving CLB relative to the driving CLB.
23392339
:opt_param switch_name: [Optional, defaults to delay-less switch if not specified] The name of the ``<switch>`` from ``<switchlist>`` to be used for this direct connection.
2340-
:opt_param from_side: The associated from_pin's block size (must be one of ``left``, ``right``, ``top``, ``bottom`` or left unspecified)
2341-
:opt_param to_side: The associated to_pin's block size (must be one of ``left``, ``right``, ``top``, ``bottom`` or left unspecified)
2340+
:opt_param from_side: The associated from_pin's block side (must be one of ``left``, ``right``, ``top``, ``bottom`` or left unspecified)
2341+
:opt_param to_side: The associated to_pin's block side (must be one of ``left``, ``right``, ``top``, ``bottom`` or left unspecified)
23422342

23432343
Describes a dedicated connection between two complex block pins that skips general interconnect.
23442344
This is useful for describing structures such as carry chains as well as adjacent neighbour connections.
-34.9 KB

doc/src/quickstart/index.rst

Lines changed: 188 additions & 130 deletions
Large diffs are not rendered by default.

doc/src/quickstart/tseng_blk1.png

-21.6 KB

doc/src/quickstart/tseng_nets.png

2.86 KB

doc/src/vpr/command_line_usage.rst

Lines changed: 37 additions & 16 deletions
Original file line numberDiff line numberDiff line change
@@ -1074,12 +1074,16 @@ The following options are only used when FPGA device and netlist contain a NoC r
10741074

10751075
.. note:: noc_flows_file are required to specify if NoC optimization is turned on (--noc on).
10761076

1077-
.. option:: --noc_routing_algorithm {xy_routing | bfs_routing}
1077+
.. option:: --noc_routing_algorithm {xy_routing | bfs_routing | west_first_routing | north_last_routing | negative_first_routing | odd_even_routing}
10781078

10791079
Controls the algorithm used by the NoC to route packets.
10801080

10811081
* ``xy_routing`` Uses the direction oriented routing algorithm. This is recommended to be used with mesh NoC topologies.
1082-
* ``bfs_routing`` Uses the breadth first search algorithm. The objective is to find a route that uses a minimum number of links. This can be used with any NoC topology.
1082+
* ``bfs_routing`` Uses the breadth first search algorithm. The objective is to find a route that uses a minimum number of links. This algorithm is not guaranteed to generate deadlock-free traffic flow routes, but can be used with any NoC topology.
1083+
* ``west_first_routing`` Uses the west-first routing algorithm. This is recommended to be used with mesh NoC topologies.
1084+
* ``north_last_routing`` Uses the north-last routing algorithm. This is recommended to be used with mesh NoC topologies.
1085+
* ``negative_first_routing`` Uses the negative-first routing algorithm. This is recommended to be used with mesh NoC topologies.
1086+
* ``odd_even_routing`` Uses the odd-even routing algorithm. This is recommended to be used with mesh NoC topologies.
10831087

10841088
**Default:** ``bfs_routing``
10851089

@@ -1091,28 +1095,45 @@ The following options are only used when FPGA device and netlist contain a NoC r
10911095
* ``noc_placement_weighting = 1`` means noc placement is considered equal to timing and wirelength.
10921096
* ``noc_placement_weighting > 1`` means the placement is increasingly dominated by NoC parameters.
10931097

1094-
**Default:** ``0.6``
1098+
**Default:** ``5.0``
1099+
1100+
.. option:: --noc_aggregate_bandwidth_weighting <float>
1101+
1102+
Controls the importance of minimizing the NoC aggregate bandwidth. This value can be >=0, where 0 would mean the aggregate bandwidth has no relevance to placement.
1103+
Other positive numbers specify the importance of minimizing the NoC aggregate bandwidth compared to other NoC-related cost terms.
1104+
Weighting factors for NoC-related cost terms are normalized internally. Therefore, their absolute values are not important, and
1105+
only their relative ratios determine the importance of each cost term.
1106+
1107+
**Default:** ``0.38``
10951108

10961109
.. option:: --noc_latency_constraints_weighting <float>
10971110

1098-
Controls the importance of meeting all the NoC traffic flow latency constraints.
1111+
Controls the importance of meeting all the NoC traffic flow latency constraints. This value can be >=0, where 0 would mean latency constraints have no relevance to placement.
1112+
Other positive numbers specify the importance of meeting latency constraints compared to other NoC-related cost terms.
1113+
Weighting factors for NoC-related cost terms are normalized internally. Therefore, their absolute values are not important, and
1114+
only their relative ratios determine the importance of each cost term.
10991115

1100-
* ``latency_constraints = 0`` means the latency constraints have no relevance to placement.
1101-
* ``0 < latency_constraints < 1`` means the latency constraints are weighted equally to the sum of other placement cost components.
1102-
* ``latency_constraints > 1`` means the placement is increasingly dominated by reducing the latency constraints of the traffic flows.
1103-
1104-
**Default:** ``1``
1116+
**Default:** ``0.6``
11051117

11061118
.. option:: --noc_latency_weighting <float>
11071119

11081120
Controls the importance of reducing the latencies of the NoC traffic flows.
1109-
This value can be >=0,
1121+
This value can be >=0, where 0 would mean the latencies have no relevance to placement
1122+
Other positive numbers specify the importance of minimizing aggregate latency compared to other NoC-related cost terms.
1123+
Weighting factors for NoC-related cost terms are normalized internally. Therefore, their absolute values are not important, and
1124+
only their relative ratios determine the importance of each cost term.
11101125

1111-
* ``latency = 0`` means the latencies have no relevance to placement.
1112-
* ``0 < latency < 1`` means the latencies are weighted equally to the sum of other placement cost components.
1113-
* ``latency > 1`` means the placement is increasingly dominated by reducing the latencies of the traffic flows.
1114-
1115-
**Default:** ``0.05``
1126+
**Default:** ``0.02``
1127+
1128+
.. option:: --noc_congestion_weighting <float>
1129+
1130+
Controls the importance of reducing the congestion of the NoC links.
1131+
This value can be >=0, where 0 would mean the congestion has no relevance to placement.
1132+
Other positive numbers specify the importance of minimizing congestion compared to other NoC-related cost terms.
1133+
Weighting factors for NoC-related cost terms are normalized internally. Therefore, their absolute values are not important, and
1134+
only their relative ratios determine the importance of each cost term.
1135+
1136+
**Default:** ``0.25``
11161137

11171138
.. option:: --noc_swap_percentage <float>
11181139

@@ -1122,7 +1143,7 @@ The following options are only used when FPGA device and netlist contain a NoC r
11221143
* ``0`` means NoC blocks will be moved at the same rate as other blocks.
11231144
* ``100`` means all swaps attempted by the placer are NoC router blocks.
11241145

1125-
**Default:** ``40``
1146+
**Default:** ``0``
11261147

11271148
.. option:: --noc_placement_file_name <file>
11281149
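
The option descriptions above state that the weighting factors for NoC-related cost terms are normalized internally, so only their relative ratios matter. A minimal sketch of what that implies, assuming normalization means dividing each weight by the sum of the four weights (the documentation does not spell out the formula). `NocWeights`, `normalize` and `nearly_equal` are hypothetical names for illustration, not VPR identifiers.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical container for the four NoC cost-term weights documented above.
struct NocWeights {
    double aggregate_bandwidth;
    double latency_constraints;
    double latency;
    double congestion;
};

// Assumed normalization: divide each weight by the sum of all four so the
// results add up to 1. VPR's actual formula may differ.
inline NocWeights normalize(const NocWeights& w) {
    const double sum = w.aggregate_bandwidth + w.latency_constraints + w.latency + w.congestion;
    assert(sum > 0.0 && "at least one weight must be positive");
    return {w.aggregate_bandwidth / sum, w.latency_constraints / sum,
            w.latency / sum, w.congestion / sum};
}

// True when two weight sets match within a small tolerance.
inline bool nearly_equal(const NocWeights& a, const NocWeights& b, double eps = 1e-12) {
    return std::fabs(a.aggregate_bandwidth - b.aggregate_bandwidth) < eps
        && std::fabs(a.latency_constraints - b.latency_constraints) < eps
        && std::fabs(a.latency - b.latency) < eps
        && std::fabs(a.congestion - b.congestion) < eps;
}
```

Under this assumption, passing the documented defaults (0.38, 0.6, 0.02, 0.25) or the same values multiplied by ten yields the same normalized weights, which is why the absolute values are described as unimportant.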

libs/EXTERNAL/libblifparse/CMakeLists.txt

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -45,6 +45,10 @@ add_library(libblifparse STATIC
4545
target_include_directories(libblifparse PUBLIC ${LIB_INCLUDE_DIRS} ${CMAKE_CURRENT_BINARY_DIR})
4646
set_target_properties(libblifparse PROPERTIES PREFIX "") #Avoid extra 'lib' prefix
4747

48+
# Set the read buffer size in the generated lexers. This reduces the number of
49+
# syscalls since the default is only 1kB.
50+
target_compile_definitions(libblifparse PRIVATE YY_READ_BUF_SIZE=1048576)
51+
4852
#Create the test executable
4953
add_executable(blifparse_test src/main.cpp)
5054
target_link_libraries(blifparse_test libblifparse)

libs/EXTERNAL/libezgl/CMakeLists.txt

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
cmake_minimum_required(VERSION 3.9 FATAL_ERROR)
1+
cmake_minimum_required(VERSION 3.10 FATAL_ERROR)
22

33
# create the project
44
project(

libs/EXTERNAL/libezgl/examples/basic-application/CMakeLists.txt

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
cmake_minimum_required(VERSION 3.9 FATAL_ERROR)
1+
cmake_minimum_required(VERSION 3.10 FATAL_ERROR)
22

33
project(
44
basic-application

libs/EXTERNAL/libtatum/libtatum/tatum/TimingGraph.cpp

Lines changed: 11 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -200,7 +200,7 @@ NodeId TimingGraph::add_node(const NodeType type) {
200200

201201
EdgeId TimingGraph::add_edge(const EdgeType type, const NodeId src_node, const NodeId sink_node) {
202202
//We require that the source/sink node must already be in the graph,
203-
// so we can update them with thier edge references
203+
// so we can update them with their edge references
204204
TATUM_ASSERT(valid_node_id(src_node));
205205
TATUM_ASSERT(valid_node_id(sink_node));
206206

@@ -211,7 +211,7 @@ EdgeId TimingGraph::add_edge(const EdgeType type, const NodeId src_node, const N
211211
EdgeId edge_id = EdgeId(edge_ids_.size());
212212
edge_ids_.push_back(edge_id);
213213

214-
//Create the edgge
214+
//Create the edge
215215
edge_types_.push_back(type);
216216
edge_src_nodes_.push_back(src_node);
217217
edge_sink_nodes_.push_back(sink_node);
@@ -318,7 +318,7 @@ GraphIdMaps TimingGraph::compress() {
318318
levelize();
319319
validate();
320320

321-
return {node_id_map, edge_id_map};
321+
return {std::move(node_id_map), std::move(edge_id_map)};
322322
}
323323

324324
void TimingGraph::levelize() {
@@ -474,21 +474,20 @@ GraphIdMaps TimingGraph::optimize_layout() {
474474

475475
levelize();
476476

477-
return {node_id_map, edge_id_map};
477+
return {std::move(node_id_map), std::move(edge_id_map)};
478478
}
479479

480480
tatum::util::linear_map<EdgeId,EdgeId> TimingGraph::optimize_edge_layout() const {
481481
//Make all edges in a level be contiguous in memory
482482

483483
//Determine the edges driven by each level of the graph
484-
std::vector<std::vector<EdgeId>> edge_levels;
484+
std::vector<std::vector<EdgeId>> edge_levels(levels().size());
485485
for(LevelId level_id : levels()) {
486-
edge_levels.push_back(std::vector<EdgeId>());
487-
for(auto node_id : level_nodes(level_id)) {
486+
for(NodeId node_id : level_nodes(level_id)) {
488487

489488
//We walk the nodes according to the input-edge order.
490489
//This is the same order used by the arrival-time traversal (which is responsible
491-
//for most of the analyzer run-time), so matching it's order exactly results in
490+
//for most of the analyzer run-time), so matching its order exactly results in
492491
//better cache locality
493492
for(EdgeId edge_id : node_in_edges(node_id)) {
494493

@@ -498,7 +497,7 @@ tatum::util::linear_map<EdgeId,EdgeId> TimingGraph::optimize_edge_layout() const
498497
}
499498
}
500499

501-
//Maps from from original to new edge id, used to update node to edge refs
500+
//Maps from original to new edge id, used to update node to edge refs
502501
tatum::util::linear_map<EdgeId,EdgeId> orig_to_new_edge_id(edges().size());
503502

504503
//Determine the new order
@@ -874,7 +873,7 @@ std::vector<std::vector<NodeId>> identify_combinational_loops(const TimingGraph&
874873
}
875874

876875
std::vector<NodeId> find_transitively_connected_nodes(const TimingGraph& tg,
877-
const std::vector<NodeId> through_nodes,
876+
const std::vector<NodeId>& through_nodes,
878877
size_t max_depth) {
879878
std::vector<NodeId> nodes;
880879

@@ -890,7 +889,7 @@ std::vector<NodeId> find_transitively_connected_nodes(const TimingGraph& tg,
890889
}
891890

892891
std::vector<NodeId> find_transitive_fanin_nodes(const TimingGraph& tg,
893-
const std::vector<NodeId> sinks,
892+
const std::vector<NodeId>& sinks,
894893
size_t max_depth) {
895894
std::vector<NodeId> nodes;
896895

@@ -905,7 +904,7 @@ std::vector<NodeId> find_transitive_fanin_nodes(const TimingGraph& tg,
905904
}
906905

907906
std::vector<NodeId> find_transitive_fanout_nodes(const TimingGraph& tg,
908-
const std::vector<NodeId> sources,
907+
const std::vector<NodeId>& sources,
909908
size_t max_depth) {
910909
std::vector<NodeId> nodes;
911910
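
The `optimize_edge_layout()` hunk above replaces a per-level `push_back` with an outer vector sized up front, and the later hunks switch several parameters to const references. A minimal sketch of both patterns, using plain `int` IDs in place of `LevelId`, `NodeId` and `EdgeId`. The function `group_edges_by_level` and its parameters are hypothetical, and indexing `edge_levels` by level position is an assumption, since the rest of the loop body is not shown in the diff.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Groups incoming edge IDs by graph level.
// level_nodes[l]   : node IDs in level l (stand-in for level_nodes())
// node_in_edges[n] : incoming edge IDs of node n (stand-in for node_in_edges())
// Both are taken by const reference, so the caller's vectors are not copied.
inline std::vector<std::vector<int>> group_edges_by_level(
    const std::vector<std::vector<int>>& level_nodes,
    const std::vector<std::vector<int>>& node_in_edges) {
    // Size the outer vector once instead of calling push_back for every level.
    std::vector<std::vector<int>> edge_levels(level_nodes.size());

    for (std::size_t level = 0; level < level_nodes.size(); ++level) {
        for (int node : level_nodes[level]) {
            const std::size_t node_index = static_cast<std::size_t>(node);
            assert(node_index < node_in_edges.size());
            // Keep each node's input-edge order, as the original comment describes.
            for (int edge : node_in_edges[node_index]) {
                edge_levels[level].push_back(edge);
            }
        }
    }
    return edge_levels;
}
```

Sizing the outer vector up front avoids repeated reallocation of the outer container as levels are visited; the inner vectors still grow as edges are appended.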

libs/EXTERNAL/libtatum/libtatum/tatum/TimingGraph.hpp

Lines changed: 21 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -11,8 +11,8 @@
1111
* store all edges as bi-directional edges.
1212
*
1313
* NOTE: We store only the static connectivity and node information in the 'TimingGraph' class.
14-
* Other dynamic information (edge delays, node arrival/required times) is stored seperately.
15-
* This means that most actions opearting on the timing graph (e.g. TimingAnalyzers) only
14+
* Other dynamic information (edge delays, node arrival/required times) is stored separately.
15+
* This means that most actions operating on the timing graph (e.g. TimingAnalyzers) only
1616
* require read-only access to the timing graph.
1717
*
1818
* Accessing Graph Data
@@ -28,9 +28,9 @@
2828
* rather than the more typical "Array of Structs (AoS)" data layout.
2929
*
3030
* By using a SoA layout we keep all data for a particular field (e.g. node types) in contiguous
31-
* memory. Using an AoS layout the various fields accross nodes would *not* be contiguous
31+
* memory. Using an AoS layout the various fields across nodes would *not* be contiguous
3232
* (although the different fields within each object (e.g. a TimingNode class) would be contiguous.
33-
* Since we typically perform operations on particular fields accross nodes the SoA layout performs
33+
* Since we typically perform operations on particular fields across nodes the SoA layout performs
3434
* better (and enables memory ordering optimizations). The edges are also stored in a SOA format.
3535
*
3636
* The SoA layout also motivates the ID based approach, which allows direct indexing into the required
@@ -48,11 +48,12 @@
4848
* and ensures that each cache line pulled into the cache will (likely) be accessed multiple times
4949
* before being evicted.
5050
*
51-
* Note that performing these optimizations is currently done explicity by calling the optimize_edge_layout()
52-
* and optimize_node_layout() member functions. In the future (particularily if incremental modification
51+
* Note that performing these optimizations is currently done explicitly by calling the optimize_edge_layout()
52+
* and optimize_node_layout() member functions. In the future (particularly if incremental modification
5353
* support is added), it may be a good idea apply these modifications automatically as needed.
5454
*
5555
*/
56+
#include <utility>
5657
#include <vector>
5758
#include <set>
5859
#include <limits>
@@ -149,7 +150,7 @@ class TimingGraph {
149150

150151
///\pre The graph must be levelized.
151152
///\returns A range containing the nodes which are primary inputs (i.e. SOURCE's with no fanin, corresponding to top level design inputs pins)
152-
///\warning Not all SOURCE nodes in the graph are primary inputs (e.g. FF Q pins are SOURCE's but have incomming edges from the clock network)
153+
///\warning Not all SOURCE nodes in the graph are primary inputs (e.g. FF Q pins are SOURCE's but have incoming edges from the clock network)
153154
///\see levelize()
154155
node_range primary_inputs() const {
155156
TATUM_ASSERT_MSG(is_levelized_, "Timing graph must be levelized");
@@ -282,7 +283,7 @@ class TimingGraph {
282283
//Node data
283284
tatum::util::linear_map<NodeId,NodeId> node_ids_; //The node IDs in the graph
284285
tatum::util::linear_map<NodeId,NodeType> node_types_; //Type of node
285-
tatum::util::linear_map<NodeId,std::vector<EdgeId>> node_in_edges_; //Incomiing edge IDs for node
286+
tatum::util::linear_map<NodeId,std::vector<EdgeId>> node_in_edges_; //Incoming edge IDs for node
286287
tatum::util::linear_map<NodeId,std::vector<EdgeId>> node_out_edges_; //Out going edge IDs for node
287288
tatum::util::linear_map<NodeId,LevelId> node_levels_; //Out going edge IDs for node
288289

@@ -293,12 +294,12 @@ class TimingGraph {
293294
tatum::util::linear_map<EdgeId,NodeId> edge_src_nodes_; //Source node for each edge
294295
tatum::util::linear_map<EdgeId,bool> edges_disabled_;
295296

296-
//Auxilary graph-level info, filled in by levelize()
297+
//Auxiliary graph-level info, filled in by levelize()
297298
tatum::util::linear_map<LevelId,LevelId> level_ids_; //The level IDs in the graph
298299
tatum::util::linear_map<LevelId,std::vector<NodeId>> level_nodes_; //Nodes in each level
299300
std::vector<NodeId> primary_inputs_; //Primary input nodes of the timing graph.
300301
std::vector<NodeId> logical_outputs_; //Logical output nodes of the timing graph.
301-
bool is_levelized_ = false; //Inidcates if the current levelization is valid
302+
bool is_levelized_ = false; //Indicates if the current levelization is valid
302303

303304
bool allow_dangling_combinational_nodes_ = false;
304305

@@ -310,26 +311,31 @@ std::vector<std::vector<NodeId>> identify_combinational_loops(const TimingGraph&
310311
//Returns the set of nodes transitively connected (either fanin or fanout) to nodes in through_nodes
311312
//up to max_depth (default infinite) hops away
312313
std::vector<NodeId> find_transitively_connected_nodes(const TimingGraph& tg,
313-
const std::vector<NodeId> through_nodes,
314+
const std::vector<NodeId>& through_nodes,
314315
size_t max_depth=std::numeric_limits<size_t>::max());
315316

316317
//Returns the set of nodes in the transitive fanin of nodes in sinks up to max_depth (default infinite) hops away
317318
std::vector<NodeId> find_transitive_fanin_nodes(const TimingGraph& tg,
318-
const std::vector<NodeId> sinks,
319+
const std::vector<NodeId>& sinks,
319320
size_t max_depth=std::numeric_limits<size_t>::max());
320321

321322
//Returns the set of nodes in the transitive fanout of nodes in sources up to max_depth (default infinite) hops away
322323
std::vector<NodeId> find_transitive_fanout_nodes(const TimingGraph& tg,
323-
const std::vector<NodeId> sources,
324+
const std::vector<NodeId>& sources,
324325
size_t max_depth=std::numeric_limits<size_t>::max());
325326

326327
EdgeType infer_edge_type(const TimingGraph& tg, EdgeId edge);
327328

328329
//Mappings from old to new IDs
329330
struct GraphIdMaps {
330-
GraphIdMaps(tatum::util::linear_map<NodeId,NodeId> node_map,
331-
tatum::util::linear_map<EdgeId,EdgeId> edge_map)
331+
GraphIdMaps(const tatum::util::linear_map<NodeId,NodeId>& node_map,
332+
const tatum::util::linear_map<EdgeId,EdgeId>& edge_map)
332333
: node_id_map(node_map), edge_id_map(edge_map) {}
334+
335+
GraphIdMaps(tatum::util::linear_map<NodeId,NodeId>&& node_map,
336+
tatum::util::linear_map<EdgeId,EdgeId>&& edge_map)
337+
: node_id_map(std::move(node_map)), edge_id_map(std::move(edge_map)) {}
338+
333339
tatum::util::linear_map<NodeId,NodeId> node_id_map;
334340
tatum::util::linear_map<EdgeId,EdgeId> edge_id_map;
335341
};
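
The `GraphIdMaps` change above adds an rvalue-reference constructor so that `return {std::move(node_id_map), std::move(edge_id_map)};` can hand over the maps without copying them. A minimal sketch of that overload pair, assuming `std::vector<int>` as a stand-in for `tatum::util::linear_map`. The names `IdMap`, `IdMaps` and `make_maps_by_move` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Stand-in for tatum::util::linear_map; a plain vector keeps the sketch self-contained.
using IdMap = std::vector<int>;

struct IdMaps {
    // Copying overload: selected when the caller passes lvalues.
    IdMaps(const IdMap& node_map, const IdMap& edge_map)
        : node_id_map(node_map), edge_id_map(edge_map) {}

    // Moving overload: selected when the caller passes rvalues, e.g. std::move(x).
    IdMaps(IdMap&& node_map, IdMap&& edge_map)
        : node_id_map(std::move(node_map)), edge_id_map(std::move(edge_map)) {}

    IdMap node_id_map;
    IdMap edge_id_map;
};

// Mirrors the shape of the return statements changed in TimingGraph.cpp.
inline IdMaps make_maps_by_move(IdMap node_map, IdMap edge_map) {
    return {std::move(node_map), std::move(edge_map)};
}
```

With a `std::vector`, the moved-to member takes over the original heap buffer, so no elements are copied. Whether `linear_map` behaves the same way depends on its own move constructor, which is not shown in this diff.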
