Commit a0e4a8d

Re-commit changes
1 parent a23006e commit a0e4a8d

File tree

171 files changed

+246085
-7087
lines changed

.gitignore

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -153,4 +153,4 @@ tags
153153
.idea
154154
cmake-build-debug
155155
cmake-build-release
156-
/.metadata/
156+
/.metadata/

BUILDING.md

Lines changed: 1 addition & 1 deletion
@@ -46,7 +46,7 @@ If you download a different version of those tools, then those versions may not
4646

4747
To verify that VTR has been installed correctly, run::
4848

49-
./vtr_flow/scripts/run_vtr_task.py regression_tests/vtr_reg_basic/basic_timing
49+
./vtr_flow/scripts/run_vtr_task.py ./vtr_flow/tasks/regression_tests/vtr_reg_basic/basic_timing
5050

5151
The expected output is::
5252

README.developers.md

Lines changed: 73 additions & 24 deletions
@@ -1272,47 +1272,96 @@ make CMAKE_PARAMS="-DVTR_IPO_BUILD=off" -j8 vpr
12721272

12731273
# Profiling VTR
12741274

1275-
1. Install `gprof`, `gprof2dot`, and `xdot`. Specifically, the previous two packages require python3, and you should install the last one with `sudo apt install` for all the dependencies you will need for visualizing your profile results.
1275+
## Use GNU Profiler gprof
1276+
1277+
1. **Installation**: Install `gprof`, `gprof2dot`, and `xdot` (optional).
1278+
1. `gprof` is part of [GNU Binutils](https://www.gnu.org/software/binutils/), which is a commonly-installed package alongside the standard GCC package on most systems. `gprof` should already exist. If not, use `sudo apt install binutils`.
1279+
2. `gprof2dot` requires python3 or conda. You can install with `pip3 install gprof2dot` or `conda install -c conda-forge gprof2dot`.
1280+
3. `xdot` is optional. To install it, use `sudo apt install xdot`.
12761281
```
1277-
pip3 install gprof
1282+
sudo apt install binutils
12781283
pip3 install gprof2dot
1279-
sudo apt install xdot
1284+
sudo apt install xdot # optional
12801285
```
12811286

12821287
Contact your administrator if you do not have the `sudo` rights.
12831288

1284-
2. Use the CMake option below to enable VPR profiler build.
1289+
2. **VPR build**: Use the CMake option below to enable VPR profiler build.
12851290
```
12861291
make CMAKE_PARAMS="-DVTR_ENABLE_PROFILING=ON" vpr
12871292
```
12881293

1289-
3. With the profiler build, each time you run the VTR flow script, it will produce an extra file `gmon.out` that contains the raw profile information.
1290-
Run `gprof` to parse this file. You will need to specify the path to the VPR executable.
1294+
3. **Profiling**:
1295+
1. With the profiler build, each time you run the VTR flow script, it will produce an extra file `gmon.out` that contains the raw profile information. Run `gprof` to parse this file. You will need to specify the path to the VPR executable.
1296+
```
1297+
gprof $VTR_ROOT/vpr/vpr gmon.out > gprof.txt
1298+
```
1299+
1300+
2. Next, use `gprof2dot` to transform the parsed results to a `.dot` file (Graphviz graph description), which describes the graph of your final profile results. If you encounter long function names, specify the `-s` option for a cleaner graph. For other useful options, please refer to its [online documentation](https://github.com/jrfonseca/gprof2dot?tab=readme-ov-file#documentation).
1301+
```
1302+
gprof2dot -s gprof.txt > vpr.dot
1303+
```
1304+
1305+
- Note: You can chain the above commands to directly produce the `.dot` file:
1306+
```
1307+
gprof $VTR_ROOT/vpr/vpr gmon.out | gprof2dot -s > vpr.dot
1308+
```
1309+
1310+
4. **Visualization**:
1311+
- **Option 1** (Recommended): Use the [Edotor](https://edotor.net/) online Graphviz visualizer.
1312+
1. Open a browser and go to [https://edotor.net/](https://edotor.net/) (on any device, not necessarily the one where VPR is running).
1313+
2. Choose `dot` as the "Engine" at the top navigation bar.
1314+
3. Next, copy and paste `vpr.dot` into the editor space on the left side of the web view.
1315+
4. Then, you can interactively (i.e., pan and zoom) view the results and download an SVG or PNG image.
1316+
- **Option 2**: Use the locally-installed `xdot` visualization tool.
1317+
1. Use `xdot` to view your results:
1318+
```
1319+
xdot vpr.dot
1320+
```
1321+
2. To save your results as a PNG file:
1322+
```
1323+
dot -Tpng -Gdpi=300 vpr.dot > vpr.png
1324+
```
1325+
Note that you can use the `-Gdpi` option to make your picture clearer if you find the default dpi settings not clear enough.
1326+
1327+
## Use Linux Perf Tool
1328+
1329+
1. **Installation**: Install `perf` and `gprof2dot` (optional).
12911330
```
1292-
gprof $VTR_ROOT/vpr/vpr gmon.out > gprof.txt
1331+
sudo apt install linux-tools-common linux-tools-generic
1332+
pip3 install gprof2dot # optional
12931333
```
12941334

1295-
4. Next, use `gprof2dot` to transform the parsed results to a `.dot` file, which describes the graph of your final profile results. If you encounter long function names, specify the `-s` option for a cleaner graph.
1335+
2. **VPR build**: *No need* to enable any CMake options for using `perf`, unless you want to utilize specific features, such as `perf annotate`.
12961336
```
1297-
gprof2dot -s gprof.txt > vpr.dot
1337+
make vpr
12981338
```
12991339

1300-
5. You can chain the above commands to directly produce the `.dot` file:
1301-
```
1302-
gprof $VTR_ROOT/vpr/vpr gmon.out | gprof2dot -s > vpr.dot
1303-
```
1340+
3. **Profiling**: `perf` needs to know the process ID (i.e., pid) of the running VPR you want to monitor and profile, which can be obtained using the Linux command `top -u <username>`.
1341+
- **Option 1**: Real-time analysis
1342+
```
1343+
sudo perf top -p <vpr pid>
1344+
```
1345+
- **Option 2** (Recommended): Record and offline analysis
1346+
1347+
Use `perf record` to record the profile data and the call graph. (Note: The argument `lbr` for `--call-graph` only works on Intel platforms. If you encounter issues with call graph recording, please refer to the [`perf record` manual](https://perf.wiki.kernel.org/index.php/Latest_Manual_Page_of_perf-record.1) for more information.)
1348+
```
1349+
sudo perf record --call-graph lbr -p <vpr pid>
1350+
```
1351+
After VPR completes its run, or if you stop `perf` with CTRL+C (if you are focusing on a specific portion of the VPR execution), the `perf` tool will produce an extra file `perf.data` containing the raw profile results in the directory where you ran `perf`. You can further analyze the results by parsing this file using `perf report`.
1352+
```
1353+
sudo perf report -i perf.data
1354+
```
1355+
- Note 1: The official `perf` [wiki](https://perf.wiki.kernel.org/index.php/Main_Page) and [tutorial](https://perf.wiki.kernel.org/index.php/Tutorial) are highly recommended for those who want to explore more uses of the tool.
1356+
- Note 2: It is highly recommended to run `perf` with `sudo`, but you can find a workaround [here](https://superuser.com/questions/980632/run-perf-without-root-rights) to allow running `perf` without root rights.
1357+
- Note 3: You may also find [Hotspot](https://github.com/KDAB/hotspot) useful if you want to run `perf` with GUI support.
1358+
1359+
4. **Visualization** (optional): If you want a better illustration of the profiling results, first run the following command to transform the `perf` report into a Graphviz dot graph. The remaining steps are exactly the same as those described under [Use GNU Profiler gprof](#use-gnu-profiler-gprof).
1361+
```
1362+
perf script -i perf.data | c++filt | gprof2dot -f perf -s > vpr.dot
1363+
```
13041364

1305-
6. Use `xdot` to view your results:
1306-
```
1307-
xdot vpr.dot
1308-
```
1309-
1310-
7. To save your results as a `png` file:
1311-
```
1312-
dot -Tpng -Gdpi=300 vpr.dot > vpr.png
1313-
```
1314-
1315-
Note that you can use the `-Gdpi` option to make your picture clearer if you find the default dpi settings not clear enough.
13161365

13171366
# External Subtrees
13181367
VTR includes some code which is developed in external repositories, and is integrated into the VTR source tree using [git subtrees](https://www.atlassian.com/blog/git/alternatives-to-git-submodule-git-subtree).

doc/src/api/vprinternals/index.rst

Lines changed: 1 addition & 0 deletions
@@ -10,3 +10,4 @@ VPR INTERNALS
1010
vpr_ui
1111
draw_files
1212
vpr_noc
13+
vpr_router
Lines changed: 31 additions & 0 deletions
@@ -0,0 +1,31 @@
1+
==============
2+
Router Heap
3+
==============
4+
5+
t_heap
6+
----------
7+
.. doxygenstruct:: t_heap
8+
:project: vpr
9+
:members:
10+
11+
HeapInterface
12+
----------
13+
.. doxygenclass:: HeapInterface
14+
:project: vpr
15+
:members:
16+
17+
HeapStorage
18+
----------
19+
.. doxygenclass:: HeapStorage
20+
:project: vpr
21+
:members:
22+
23+
KAryHeap
24+
----------
25+
.. doxygenclass:: KAryHeap
26+
:project: vpr
27+
28+
FourAryHeap
29+
-----------
30+
.. doxygenclass:: FourAryHeap
31+
:project: vpr
Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
1+
.. _router:
2+
3+
==========
4+
VPR Router
5+
==========
6+
7+
.. toctree::
8+
:maxdepth: 1
9+
10+
router_heap

doc/src/vpr/placement_constraints.rst

Lines changed: 33 additions & 3 deletions
@@ -28,6 +28,14 @@ A Placement Constraints File Example
2828
<add_atom name_pattern="n4917"/>
2929
<add_atom name_pattern="n6010"/>
3030
</partition>
31+
32+
<partition name="Part2">
33+
<add_region x_low="3" y_low="3" x_high="85" y_high="85"/> <!-- When the layer is not explicitly specified, layer 0 is assumed. -->
34+
<add_region x_low="8" y_low="5" x_high="142" y_high="29" layer_low="0" layer_high="1"/> <!-- In 3D architectures, the region can span across multiple layers. -->
35+
<add_region x_low="6" y_low="55" x_high="50" y_high="129" layer_low="2" layer_high="2"/> <!-- If the region only covers a non-zero layer, both layer_low and layer_high must be set to the same value. -->
36+
<add_atom name_pattern="n135"/>
37+
<add_atom name_pattern="n7016"/>
38+
</partition>
3139
</partition_list>
3240
</vpr_constraints>
3341
@@ -75,7 +83,10 @@ The ``name_pattern`` can be the exact name of the atom from the input atom netli
7583
Region
7684
^^^^^^
7785

78-
An ``<add_region>`` tag is used to add a region to the partition. A ``region`` is a rectangular area on the chip. A partition can contain any number of independent regions - the regions within one partition must not overlap with each other (in order to ease processing when loading in the file). An ``<add_region>`` tag has the following attributes.
86+
An ``<add_region>`` tag is used to add a region to the partition. A ``region`` is a rectangular area or cubic volume
87+
on the chip. A partition can contain any number of independent regions - the regions within one partition **must not**
88+
overlap with each other (in order to ease processing when loading in the file).
89+
An ``<add_region>`` tag has the following attributes.
7990

8091
:req_param x_low:
8192
The x value of the lower left point of the rectangle.
@@ -90,11 +101,30 @@ An ``<add_region>`` tag is used to add a region to the partition. A ``region`` i
90101
The y value of the upper right point of the rectangle.
91102

92103
:opt_param subtile:
93-
Each x, y location on the grid may contain multiple locations known as subtiles. This paramter is an optional value specifying the subtile location that the atom(s) of the partition shall be constrained to.
104+
Each x, y location on the grid may contain multiple locations known as subtiles. This parameter is an optional value specifying the subtile location that the atom(s) of the partition shall be constrained to.
105+
106+
:opt_param layer_low:
107+
The lowest layer number that the region covers. The default value is 0.
108+
109+
:opt_param layer_high:
110+
The highest layer number that the region covers. The default value is 0.
94111

95112
The optional ``subtile`` attribute is commonly used when constraining an atom to a specific location on the chip (e.g. an exact I/O location). It is legal to use with larger regions, but uncommon.
96113

97-
If a user would like to specify an area on the chip with an unusual shape (e.g. L-shaped or T-shaped), they can simply add multiple ``<add_region>`` tags to cover the area specified.
114+
In 2D architectures, ``layer_low`` and ``layer_high`` can be safely ignored as their default value is 0.
115+
In 3D architectures, a region can span across multiple layers or be assigned to a specific layer.
116+
For assigning a region to a specific non-zero layer, the user should set both ``layer_low`` and ``layer_high`` to the
117+
desired layer number. If a layer range is to be covered by the region, the user should set ``layer_low`` and ``layer_high`` to
118+
different values.
119+
120+
If a user would like to specify an area on the chip with an unusual shape (e.g. L-shaped or T-shaped),
121+
they can simply add multiple ``<add_region>`` tags to cover the area specified.
122+
123+
It is strongly recommended that different partitions do not overlap. The packing algorithm compares the number of clustered
124+
blocks and the number of physical blocks in a region to decide whether to pack atoms inside a partition more aggressively when
125+
there are not enough resources in a partition. Overlapping partitions cause some physical blocks to be counted in more
126+
than one partition.
127+
98128

99129

100130
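The non-overlap rule for regions described in this file can be sketched as an interval test on each axis. The snippet below is only an illustration, not VPR's implementation: the `Region` struct, the `intervals_intersect` helper, and `regions_overlap` are hypothetical names, and it assumes that all bounds (x, y, and layer) are inclusive.

```cpp
#include <cassert>

// Hypothetical mirror of the <add_region> attributes (not VPR's own Region class).
struct Region {
    int x_low, y_low, x_high, y_high;
    int layer_low, layer_high;

    // layer_low and layer_high default to 0, matching the documented defaults.
    Region(int xl, int yl, int xh, int yh, int ll = 0, int lh = 0)
        : x_low(xl), y_low(yl), x_high(xh), y_high(yh), layer_low(ll), layer_high(lh) {
        // A well-formed region has low <= high on every axis.
        assert(x_low <= x_high && y_low <= y_high && layer_low <= layer_high);
    }
};

// Closed intervals [a_low, a_high] and [b_low, b_high] intersect when
// neither one ends before the other begins (assumption: inclusive bounds).
inline bool intervals_intersect(int a_low, int a_high, int b_low, int b_high) {
    return a_low <= b_high && b_low <= a_high;
}

// Two regions overlap only if they intersect in x, in y, and in the layer range.
inline bool regions_overlap(const Region& a, const Region& b) {
    return intervals_intersect(a.x_low, a.x_high, b.x_low, b.x_high)
        && intervals_intersect(a.y_low, a.y_high, b.y_low, b.y_high)
        && intervals_intersect(a.layer_low, a.layer_high, b.layer_low, b.layer_high);
}
```

Under these assumptions, a region restricted to layer 2, such as `Region(6, 55, 50, 129, 2, 2)`, never overlaps a region that relies on the default layer 0. Note that the first two `Part2` regions in the example file, `Region(3, 3, 85, 85)` and `Region(8, 5, 142, 29, 0, 1)`, would be reported as overlapping by this check, so their coordinates are probably best read as illustrative only.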

libs/libarchfpga/src/arch_util.cpp

Lines changed: 1 addition & 1 deletion
@@ -35,7 +35,7 @@ const char* get_arch_file_name() {
3535
return arch_file_name;
3636
}
3737

38-
InstPort::InstPort(std::string str) {
38+
InstPort::InstPort(const std::string& str) {
3939
std::vector<std::string> inst_port = vtr::split(str, ".");
4040

4141
if (inst_port.size() == 1) {

libs/libarchfpga/src/arch_util.h

Lines changed: 1 addition & 1 deletion
@@ -22,7 +22,7 @@ class InstPort {
2222
static constexpr int UNSPECIFIED = -1;
2323

2424
InstPort() = default;
25-
InstPort(std::string str);
25+
InstPort(const std::string& str);
2626
std::string instance_name() const { return instance_.name; }
2727
std::string port_name() const { return port_.name; }
2828
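The signature change above means the caller's string is no longer copied when an `InstPort` is constructed; the constructor only reads it and splits it on `.`. A minimal sketch of that pattern follows. `split_on` is a local stand-in written for this example, not `vtr::split`, and the real helper may handle empty tokens differently.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Local stand-in for vtr::split. Takes the text by const reference, so the
// caller's string is read in place rather than copied.
std::vector<std::string> split_on(const std::string& text, char delim) {
    std::vector<std::string> tokens;
    std::string::size_type start = 0;
    std::string::size_type end;
    while ((end = text.find(delim, start)) != std::string::npos) {
        tokens.push_back(text.substr(start, end - start));
        start = end + 1;
    }
    // Trailing token; this is the whole string when no delimiter is present.
    tokens.push_back(text.substr(start));
    assert(!tokens.empty());
    return tokens;
}
```

With this helper, `split_on("clb.in", '.')` yields two tokens and `split_on("in", '.')` yields one, which corresponds to the `inst_port.size() == 1` branch visible in the constructor.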

libs/libarchfpga/src/echo_arch.cpp

Lines changed: 6 additions & 6 deletions
@@ -236,11 +236,11 @@ void PrintArchInfo(FILE* Echo, const t_arch* arch) {
236236
}
237237

238238
fprintf(Echo, "\tInput Connect Block Switch Name Within a Same Die: %s\n", arch->ipin_cblock_switch_name[ipin_cblock_switch_index_within_die].c_str());
239-
239+
240240
//if there is more than one layer available, print the connection block switch name that is used for connection between two dice
241-
for(const auto& layout : arch->grid_layouts){
241+
for (const auto& layout : arch->grid_layouts) {
242242
int num_layers = (int)layout.layers.size();
243-
if(num_layers > 1){
243+
if (num_layers > 1) {
244244
fprintf(Echo, "\tInput Connect Block Switch Name Between Two Dice: %s\n", arch->ipin_cblock_switch_name[ipin_cblock_switch_index_between_dice].c_str());
245245
}
246246
}
@@ -295,11 +295,11 @@ void PrintArchInfo(FILE* Echo, const t_arch* arch) {
295295
fprintf(Echo, "\t\t\t\ttype unidir mux_name for within die connections: %s\n",
296296
arch->Switches[seg.arch_wire_switch].name.c_str());
297297
//if there is more than one layer available, print the segment switch name that is used for connection between two dice
298-
for(const auto& layout : arch->grid_layouts){
298+
for (const auto& layout : arch->grid_layouts) {
299299
int num_layers = (int)layout.layers.size();
300-
if(num_layers > 1){
300+
if (num_layers > 1) {
301301
fprintf(Echo, "\t\t\t\ttype unidir mux_name for between two dice connections: %s\n",
302-
arch->Switches[seg.arch_opin_between_dice_switch].name.c_str());
302+
arch->Switches[seg.arch_opin_between_dice_switch].name.c_str());
303303
}
304304
}
305305
} else { //Should be bidir

libs/libarchfpga/src/physical_types.h

Lines changed: 10 additions & 4 deletions
@@ -924,10 +924,10 @@ struct t_logical_block_type {
924924
std::vector<t_physical_tile_type_ptr> equivalent_tiles; ///>List of physical tiles at which one could
925925
///>place this type of netlist block.
926926

927-
std::unordered_map<int, t_pb_graph_pin*> pin_logical_num_to_pb_pin_mapping; /* pin_logical_num_to_pb_pin_mapping[pin logical number] -> pb_graph_pin ptr} */
928-
std::unordered_map<const t_pb_graph_pin*, int> primitive_pb_pin_to_logical_class_num_mapping; /* primitive_pb_pin_to_logical_class_num_mapping[pb_graph_pin ptr] -> class logical number */
929-
std::vector<t_class> primitive_logical_class_inf; /* primitive_logical_class_inf[class_logical_number] -> class */
930-
std::unordered_map<const t_pb_graph_node*, t_class_range> pb_graph_node_class_range;
927+
std::unordered_map<int, t_pb_graph_pin*> pin_logical_num_to_pb_pin_mapping; /* pin_logical_num_to_pb_pin_mapping[pin logical number] -> pb_graph_pin ptr} */
928+
std::unordered_map<const t_pb_graph_pin*, int> primitive_pb_pin_to_logical_class_num_mapping; /* primitive_pb_pin_to_logical_class_num_mapping[pb_graph_pin ptr] -> class logical number */
929+
std::vector<t_class> primitive_logical_class_inf; /* primitive_logical_class_inf[class_logical_number] -> class */
930+
std::unordered_map<const t_pb_graph_node*, t_class_range> primitive_pb_graph_node_class_range; /* primitive_pb_graph_node_class_range[primitive_pb_graph_node ptr] -> class range for that primitive*/
931931

932932
// Is this t_logical_block_type empty?
933933
bool is_empty() const;
@@ -1239,6 +1239,12 @@ class t_pb_graph_node {
12391239

12401240
int placement_index;
12411241

1242+
/*
1243+
* There is a root-level pb_graph_node assigned to each logical type. Each logical type can contain multiple primitives.
1244+
* If this pb_graph_node is associated with a primitive, a unique number is assigned to it within the logical block level.
1245+
*/
1246+
int primitive_num = OPEN;
1247+
12421248
/* Contains a collection of mode indices that cannot be used as they produce conflicts during VPR packing stage
12431249
*
12441250
* Illegal modes do arise when children of a graph_node do have inconsistent `edge_modes` with respect to

libs/libarchfpga/src/physical_types_util.cpp

Lines changed: 1 addition & 1 deletion
@@ -965,7 +965,7 @@ t_class_range get_pb_graph_node_class_physical_range(t_physical_tile_type_ptr /*
965965
const t_pb_graph_node* pb_graph_node) {
966966
VTR_ASSERT(pb_graph_node->is_primitive());
967967

968-
t_class_range class_range = logical_block->pb_graph_node_class_range.at(pb_graph_node);
968+
t_class_range class_range = logical_block->primitive_pb_graph_node_class_range.at(pb_graph_node);
969969
int logical_block_class_offset = sub_tile->primitive_class_range[sub_tile_relative_cap].at(logical_block).low;
970970

971971
class_range.low += logical_block_class_offset;
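The lookup in this hunk can be sketched as follows: the renamed map gives a primitive's class range relative to its logical block, and an offset is added to move it into the sub-tile's numbering. `ClassRange`, `PbGraphNode`, and `physical_class_range` are simplified, hypothetical stand-ins for the VTR types, and shifting `high` as well as `low` is an assumption, since the hunk is cut off after the `low` update.

```cpp
#include <cassert>
#include <unordered_map>

struct ClassRange { int low; int high; };    // stand-in for t_class_range
struct PbGraphNode { int primitive_num; };   // stand-in for t_pb_graph_node

// Stand-in for primitive_pb_graph_node_class_range:
// primitive pb_graph_node pointer -> class range within the logical block.
using PrimitiveClassRanges = std::unordered_map<const PbGraphNode*, ClassRange>;

ClassRange physical_class_range(const PrimitiveClassRanges& ranges,
                                const PbGraphNode* primitive,
                                int logical_block_class_offset) {
    assert(primitive != nullptr);
    // at() throws std::out_of_range if the node was never registered as a primitive.
    ClassRange range = ranges.at(primitive);
    range.low += logical_block_class_offset;
    range.high += logical_block_class_offset;  // assumption: high shifts with low
    return range;
}
```

Using `at()` rather than `operator[]` keeps the map unmodified on a failed lookup and lets it be passed by const reference, which matches the read-only access in the hunk above.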
