Noc cost normalization #2485

soheilshahrouz · 2024-02-07T00:31:25Z

Description

This PR separates NoC cost term renormalization from cost computation. Before this PR, a weighted average of aggregate latency and latency overrun was normalized. This PR separates all cost terms and computes one normalization factor for each of them.

This PR also includes code to compute NoC congestion, but currently sets its cost to 0 (don't optimize yet).

Related Issue

Motivation and Context

To keep the way NoC normalization factors are computed consistent with the bb and timing cost normalization factors.

How Has This Been Tested?

A parameter sweep was run to find the best combination of weighting factors. The results are compared with the master branch.

Types of changes

[] Bug fix (change which fixes an issue)
[] New feature (change which adds functionality)
[] Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

My change requires a change to the documentation
I have updated the documentation accordingly
I have added tests to cover my changes
All new and existing tests passed

When I pass rr graph and router lookahead files to VPR, it throws an error. capnproto uses mmap to open these files. It seems that multiple processes can access a single file using mmap. However, I cannot trust capnproto. The changes in this commit enhance the vtr task syntax by allowing arbitrary files to be copied to the temporary directory. This way, I can copy the rr graph file and prevent multiple processes from accessing the same file.

The previous commits did not work. It seems that capnproto uses PWD environment variable instead of calling getcwd(). popen method changes the working directory, but does not update PWD. I update it manually.
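A minimal sketch of the PWD workaround described above. It assumes the child process is started with Python's `subprocess.run` and a `cwd` argument; `run_in_dir` and `CHILD` are illustrative names and not functions from the vtr scripts. The child only prints what it sees, so the difference between `PWD` and `getcwd()` is visible.

```python
import os
import subprocess
import sys
import tempfile


def run_in_dir(cmd, work_dir, sync_pwd=True):
    """Run cmd with work_dir as its working directory.

    When sync_pwd is True, the PWD environment variable passed to the child
    is set to work_dir as well, so programs that read PWD instead of calling
    getcwd() see the same directory.
    """
    env = os.environ.copy()
    if sync_pwd:
        env["PWD"] = str(work_dir)
    result = subprocess.run(
        cmd, cwd=work_dir, env=env, capture_output=True, text=True, check=True
    )
    return result.stdout


# Hypothetical child: prints the PWD variable and the real working directory.
CHILD = [
    sys.executable,
    "-c",
    "import os; print(os.environ.get('PWD')); print(os.getcwd())",
]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        tmp = os.path.realpath(tmp)
        stale = run_in_dir(CHILD, tmp, sync_pwd=False).splitlines()
        synced = run_in_dir(CHILD, tmp, sync_pwd=True).splitlines()
    return tmp, stale, synced
```

With `sync_pwd=False` the child inherits the parent's `PWD` unchanged even though its working directory is the temporary directory; with `sync_pwd=True` both agree.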

This reverts commit 2d3e642.

soheilshahrouz · 2024-02-07T01:16:21Z

I ran a parameter sweep over the NoC weighting factors for aggregate latency, latency overrun, and aggregate bandwidth. The sum of the weighting factors is always equal to 1. In the following tables, the aggregate bandwidth weighting factor can be determined by subtracting the latency and latency overrun factors from 1. Cells marked x correspond to combinations where these two factors alone would sum to more than 1. The parameter sweep tables show the results when the NoC placement weighting factor is set to 5. I also ran experiments with this factor set to 10 and 1, but did not include them here, as the trend was similar across different NoC placement weighting factors.
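As a rough illustration of how such a sweep grid can be enumerated, here is a sketch in Python. The function name `weight_grid` and the step count are assumptions made for this example. The 0.2 step matches the table headers, but this is not the script that was used for the experiments.

```python
from fractions import Fraction


def weight_grid(steps=5):
    """Enumerate (latency, latency_constraint, bandwidth) weights summing to 1.

    Weights move in increments of 1/steps. Combinations where the first two
    weights already exceed 1 are skipped; these are the 'x' cells.
    """
    grid = []
    for i in range(steps + 1):
        for j in range(steps + 1):
            if i + j > steps:
                continue  # no non-negative bandwidth weight is left
            latency = Fraction(i, steps)
            constraint = Fraction(j, steps)
            bandwidth = 1 - latency - constraint
            grid.append((float(latency), float(constraint), float(bandwidth)))
    return grid


for latency, constraint, bandwidth in weight_grid():
    print(f"latency={latency:.1f} constraint={constraint:.1f} bandwidth={bandwidth:.1f}")
```

Exact fractions are used for the arithmetic so that the bandwidth weight is not affected by floating-point rounding before it is converted for printing.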

Number of met latency constraints

| ↓latency \| latency constraint → | 0.0 | 0.2 | 0.4 | 0.6 | 0.8 | 1.0 |
|---|---|---|---|---|---|---|
| 0.0 | 3.69E+01 | 4.24E+01 | 4.24E+01 | 4.24E+01 | 4.24E+01 | 4.23E+01 |
| 0.2 | 3.70E+01 | 4.24E+01 | 4.24E+01 | 4.24E+01 | 4.23E+01 | x |
| 0.4 | 3.76E+01 | 4.24E+01 | 4.24E+01 | 4.22E+01 | x | x |
| 0.6 | 3.71E+01 | 4.23E+01 | 4.20E+01 | x | x | x |
| 0.8 | 3.69E+01 | 4.19E+01 | x | x | x | x |
| 1.0 | 3.74E+01 | x | x | x | x | x |

The table above shows that when the latency constraint weighting factor is zero, the placer does not care about latency constraints. However, setting this factor to a very large value does not increase the number of met latency constraints. The latency constraint weighting factor should be greater than the aggregate latency factor so that the placer prioritizes meeting latency constraints over minimizing the aggregate latency. Another interesting observation is that the latency constraint factor does not need to be significantly larger than the aggregate latency factor. In the master branch, the latency constraint factor is 20x greater than the aggregate latency factor. This large difference was needed because non-normalized latency cost terms were added together, and latency overrun was sometimes much smaller than the aggregate latency. Therefore, to prioritize meeting latency constraints over reducing aggregate latency, the latency constraint factor had to be set to a much larger value. In this PR, aggregate latency and latency overrun are normalized separately. As a result, the latency constraint factor does not need to be considerably larger than the aggregate latency factor to prioritize meeting constraints over minimizing the aggregate latency.
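The effect of separate normalization can be illustrated with a small Python sketch. The function names and the cost magnitudes are made up for this example and are not taken from the VPR code. The sketch only assumes that each normalization factor is the inverse of a reference cost for that term.

```python
def combined_shares(costs, weights):
    """Share of each term when raw weighted terms are summed and normalized once."""
    weighted = {name: weights[name] * costs[name] for name in costs}
    total = sum(weighted.values())
    return {name: value / total for name, value in weighted.items()}


def separate_shares(costs, weights, reference_costs):
    """Share of each term when every term has its own normalization factor."""
    # Assumption: normalization factor = 1 / reference cost of that term.
    norm = {name: 1.0 / reference_costs[name] for name in costs}
    weighted = {name: weights[name] * costs[name] * norm[name] for name in costs}
    total = sum(weighted.values())
    return {name: value / total for name, value in weighted.items()}


# Hypothetical magnitudes: latency overrun much smaller than aggregate latency.
costs = {"aggregate_latency": 4.5e-7, "latency_overrun": 2.0e-9}

equal = {"aggregate_latency": 0.5, "latency_overrun": 0.5}
skewed = {"aggregate_latency": 1.0, "latency_overrun": 20.0}  # 20x ratio

print("combined, equal weights:", combined_shares(costs, equal))
print("combined, 20x weight:   ", combined_shares(costs, skewed))
print("separate, equal weights:", separate_shares(costs, equal, reference_costs=costs))
```

With these numbers, the overrun term is a small fraction of the combined cost even with a 20x weight, while separate normalization gives each term a share equal to its weight at the reference point.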

soheilshahrouz · 2024-02-07T15:10:45Z

Aggregate Latency

| ↓latency \| latency constraint → | 0.0 | 0.2 | 0.4 | 0.6 | 0.8 | 1.0 |
|---|---|---|---|---|---|---|
| 0.0 | 4.51E-07 | 4.51E-07 | 4.51E-07 | 4.51E-07 | 4.52E-07 | 5.58E-07 |
| 0.2 | 4.51E-07 | 4.51E-07 | 4.51E-07 | 4.51E-07 | 4.54E-07 | x |
| 0.4 | 4.51E-07 | 4.51E-07 | 4.51E-07 | 4.52E-07 | x | x |
| 0.6 | 4.51E-07 | 4.52E-07 | 4.55E-07 | x | x | x |
| 0.8 | 4.51E-07 | 4.54E-07 | x | x | x | x |
| 1.0 | 4.51E-07 | x | x | x | x | x |

As the table above shows, aggregate latency is not sensitive to the aggregate latency weighting factor as long as the aggregate bandwidth factor is non-zero. This is because minimizing the aggregate bandwidth requires placing traffic flow endpoints close to each other, which indirectly optimizes the aggregate latency at the same time.

soheilshahrouz · 2024-02-07T15:27:41Z

Aggregate Bandwidth

| ↓latency \| latency constraint → | 0.0 | 0.2 | 0.4 | 0.6 | 0.8 | 1.0 |
|---|---|---|---|---|---|---|
| 0.0 | 8.98E+07 | 8.98E+07 | 8.98E+07 | 8.98E+07 | 8.99E+07 | 1.21E+08 |
| 0.2 | 8.98E+07 | 8.98E+07 | 8.98E+07 | 8.98E+07 | 9.76E+07 | x |
| 0.4 | 8.98E+07 | 8.98E+07 | 8.98E+07 | 9.74E+07 | x | x |
| 0.6 | 8.98E+07 | 9.00E+07 | 9.83E+07 | x | x | x |
| 0.8 | 8.98E+07 | 9.84E+07 | x | x | x | x |
| 1.0 | 9.75E+07 | x | x | x | x | x |

Although optimizing the aggregate bandwidth minimizes aggregate latency, reducing latency does not minimize the aggregate bandwidth. As can be seen in the table above, the aggregate bandwidth grows large when the aggregate bandwidth weighting factor is zero. Increasing the aggregate latency weighting factor cannot improve the aggregate bandwidth when the bandwidth weighting factor is set to zero. This is because traffic flows may have different bandwidths. For example, assume there are 9 logical routers in a netlist, where a central router sends data to the other 8 routers, and 4 of the traffic flows have higher bandwidth than the others. The aggregate latency can be minimized by placing the central router in the middle and surrounding it with the other routers. When the aggregate bandwidth weighting factor is zero, the placer neglects traffic flow bandwidths, and traffic flows with higher bandwidths may travel multiple hops. When the aggregate bandwidth weighting factor is non-zero, routers that are connected through high-bandwidth traffic flows are placed closer together.
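The 9-router example can be worked through with a short Python sketch. It makes several simplifying assumptions that are not stated in this PR: the routers sit on a 3x3 mesh, the hop count is the Manhattan distance, latency is proportional to hop count, and the aggregate bandwidth cost is the sum of flow bandwidth times hop count. The bandwidth values are made up.

```python
CENTER = (1, 1)
ADJACENT = [(0, 1), (1, 0), (1, 2), (2, 1)]  # 1 hop from the center
CORNERS = [(0, 0), (0, 2), (2, 0), (2, 2)]   # 2 hops from the center


def hops(src, dst):
    """Manhattan distance between two mesh locations (assumed routing model)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def aggregate_costs(dest_bandwidths):
    """Return (total hop count, bandwidth-weighted hop count) for flows from CENTER."""
    total_hops = sum(hops(CENTER, dst) for dst in dest_bandwidths)
    bandwidth_cost = sum(bw * hops(CENTER, dst) for dst, bw in dest_bandwidths.items())
    return total_hops, bandwidth_cost


HIGH, LOW = 4.0, 1.0  # hypothetical flow bandwidths

# Placement A: high-bandwidth destinations end up in the corners.
high_far = {**{loc: HIGH for loc in CORNERS}, **{loc: LOW for loc in ADJACENT}}
# Placement B: high-bandwidth destinations sit next to the central router.
high_near = {**{loc: HIGH for loc in ADJACENT}, **{loc: LOW for loc in CORNERS}}

print("A:", aggregate_costs(high_far))
print("B:", aggregate_costs(high_near))
```

Under these assumptions both placements have the same total hop count (12), so a latency-only cost cannot tell them apart, while the bandwidth-weighted cost drops from 36 to 24 when the high-bandwidth flows are kept to a single hop.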

soheilshahrouz · 2024-02-07T21:57:28Z

Comparison with master
The table below shows QoR metrics for this branch and master. Complex synthetic benchmarks were used.

| branch | Aggregate Bandwidth | Aggregate Latency | Number of met latency constraints | WL | CPD | Placement time | total swaps |
|---|---|---|---|---|---|---|---|
| master | 57037946.44 | 2.99E-07 | 31.916 | 527055.04 | 6.769 | 213.674 | 3486206.917 |
| this PR | 57081463.15 | 2.99E-07 | 31.877 | 523029.40 | 6.735 | 217.383 | 3419078.206 |
| ratio to master | 1.001 | 1.0 | 0.999 | 0.992 | 0.995 | 1.017 | 0.981 |

This PR computes and keeps track of the NoC congestion cost. The increase in runtime can be partly attributed to the more complex cost computation for NoC swaps.

Link to detailed results.

vaughnbetz · 2024-02-12T22:59:02Z

QoR looks good.

vaughnbetz

Looks good, a few commenting changes and examination of one O(Nlinks) loop requested.

vpr/src/base/read_options.cpp

vpr/src/base/vpr_types.h

vpr/src/noc/noc_storage.cpp

vpr/src/noc/noc_storage.h

vpr/src/place/noc_place_utils.h

vpr/src/place/place_util.cpp

vpr/src/place/place_util.h

vtr_flow/scripts/python_libs/vtr/util.py

soheilshahrouz · 2024-02-14T04:55:37Z

@vaughnbetz
Thanks for your helpful comments. I applied them, and the code is ready for review.

vaughnbetz · 2024-02-14T05:44:38Z

Thanks. It looks good and I'm merging it. @soheilshahrouz : I think the documentation (.rst) file for the command line options also needs an update to document the new / changed command line options. The description is pretty much what is in the help for the arg parser. Can you make that update in another PR?

soheilshahrouz added 30 commits January 22, 2024 14:00

Merge branch 'init_noc_sa' into noc_congestion_model

8dc44da

Initial NoC placement PR, which is still open, has changed NoC code. To avoid conflicts in the future, this merge was required.

Merge branch 'noc_qor_doc_issue' into noc_congestion_model

04413bc

Avoid passing place_ctx.block_locs as argument in a function call hie…

508206c

…rarchy.

Add bandwidth and congestion to NoCLink

0bb3ffc

Replaced pointers to g_vpr_ctx.noc().noc_traffic_flows_storage with r…

60a740b

…eference.

compute NoC congestion cost difference for router swap

1519e60

fix syntax errors in NoC tests

79b1391

Use NocDeltaCost instead of passing 3 arguments

774670a

Add operator+=() to t_placer_costs.

c7e3cb6

I moved NocDeltaCost declaration from noc_place_utils.h to place_util.h to resolve a cyclic dependency. Forward declaration of NocDeltaCost and t_placer_costs did not solve the problem as the compiler complained about GridTileLookup.

Updated commit_noc_costs(), allocate_and_load_noc_placement_structs()…

8fcdda5

…, and free_noc_placement_structs() for NoC congestion costs

Modified noc_place_utils.cpp to compute congestion cost

3946029

Use std::unique_ptr to hold the pointer to the routing algorithm.

37a739e

Add calculate_noc_cost()

7a01eff

Add --noc_congestion_weighting command line option

e43ef3d

Compute and print NoC congestion metrics.

b2ec184

Added some comments to noc_link.h to explain what each method does.

Add get_total_congestion_bandwidth_ratio()

017da60

Fix NoC test failure

2ad4b69

Some NoC tests were failing due to newly added code for congestion modeling. This commit hopefully fixes them.

Remove init_chan() call

41af9bc

Update normalization factors during NoC initial placement

e9a27b4

pass strings by reference

f7731d2

Print NoC metrics in print_place_status()

4a22e5b

revert renormalization in initial noc placement

5726f98

Merge branch 'master' into noc_congestion_model

f93cfd9

Update test_check_noc_placement_costs to test congestion

3d41245

Update test_initial_noc_placement to check congested links

5458ba8

Update test_initial_comp_cost_functions to check congestion cost comp…

5eed8ae

…utation

Update test_find_affected_noc_routers_and_update_noc_costs, test_comm…

304de90

…it_noc_costs, test_recompute_noc_costs to check congestion

Updated test_find_affected_noc_routers_and_update_noc_costs to check …

0827d99

…routes after revert

Comment some functions and data structures

4896984

Separate NoC cost computation and normalization

6bede42

soheilshahrouz added 6 commits February 1, 2024 19:18

Update normalization factors during NoC initial placement

4b9d804

parse new noc metrics

37426cb

update PWD environment variable before spawning a subprocess

a440aa5

The previous commits did not work. It seems that capnproto uses PWD environment variable instead of calling getcwd(). popen method changes the working directory, but does not update PWD. I update it manually.

Revert "Add include_temp to vtr task syntax"

a0de5f1

This reverts commit 2d3e642.

revert renormalization during init noc placement

c33a6a8

soheilshahrouz added 3 commits February 7, 2024 10:54

remove ununsed functions

2f078fe

updated default NoC placement weighting factors

bc3557c

Merge branch 'master' into noc_cost_norm

e59bc3c

github-actions bot added the VPR, libarchfpga, lang-cpp, and lang-python labels Feb 7, 2024

soheilshahrouz changed the title ~~[WIP] Noc cost norm~~ [WIP] Noc cost normalization Feb 7, 2024

Merge branch 'master' into noc_cost_norm

e752132

soheilshahrouz changed the title ~~[WIP] Noc cost normalization~~ Noc cost normalization Feb 12, 2024

removed unused arguments

619d9e7

vaughnbetz requested changes Feb 12, 2024

soheilshahrouz added 3 commits February 13, 2024 16:12

applied PR comments

a92ba80

moved comments from source file to header

b9add7f

fix pylint errors

581c3a4

vaughnbetz merged commit bcc45db into master Feb 14, 2024

vaughnbetz deleted the noc_cost_norm branch February 14, 2024 05:44
